Tag Archives: MySQL

MySQL too many connections – solution

When greeted with a MySQL error this morning stating that a database connection could not be made from our PHP web application I had to do some testing.  First, try connecting to the database.  I did this from a remote host and was thrown back an “Error: too many connections.”

From there you should login as root from the localhost and issue:

SHOW FULL PROCESSLIST;

This will give a list of the connections. Take a good look at the big offenders because you’ll need to troubleshoot that application later and find out why it has so many connections.  Now that you know the problem, restart your MySQL and connections should resume as normal (with the excessive stale connections released) and you can go about the repair.

Also, in your my.cnf you can reduce the wait_timeout which will reduce the amount of time a connection can be held open by an application. The default is 28800 seconds, or 8 hours. I reduced mine to 30 minutes. You could go less but we have some apps that do their own connection pooling and don’t want to mess that up. In my.cnf:

wait_timeout = 1800

If you are absolutely sure you need more connections, you can increase this from the default 151 in your my.cnf as well:

# default max_connections is 151 (1 spare for super user)
max_connections = 201

Now you need to keep an eye on the situation by issuing the SHOW FULL PROCESSLIST once in a while to see if they are filling up fast. And don’t forget to visit any applications that may be showing up in there frequently.  I found two applications tha were using persistent connections, being held open for two hours, and they just didn’t need to.

Mysql Client Switch for Vertical Output

Did you ever need to examine one line from a MySQL table and were unable to line up the column headers with the data because the line was too long?

Well no fears, it turns out the MySQL client has a switch (\G) you can use to arrange output vertically for easier reading. This is very handy when you are looking at only one line. Here’s an example:


mysql> SELECT * FROM emp_vw WHERE id = 10110 LIMIT 1 \G
*************************** 1. row ***************************
id: 10110
company_id: 21100
last_name: Stall
first_name: John
gender: MALE
email: test@example.com
address: 277 142nd Ave
city: New York
state: NY
phone: (888)555-1212
birthday: 1963-05-12
active: 1
timestamp: 2011-11-30 18:58:50
1 row in set (0.01 sec)

This is a great way to examine the content of the fields of any table for the purposes of investigation. It works very well for views and tables with many columns that form long lines when displayed horizontally. This can also be applied to your my.cnf file, but I prefer to leave it off so it can be enable with the MySQL client \G switch only when needed.

MySQL ERROR 1396 (HY000): Operation CREATE USER failed

While creating a new user in MySQL today I accidentally pressed the return button while in the middle of entering a password via the command line client, which looked like this:


mysql> CREATE USER 'userguy'@'10.20.30.40' IDENTIFIED BY '
'> ;
'> '''';;
'> ';
Query OK, 0 rows affected (0.00 sec)

Which isn’t what I wanted. Terminating the command didn’t work and the command succeeded, not what I wanted. So I had to fix it.

I issued a

DROP USER

command and was greeted with:

mysql> drop user userguy;
ERROR 1396 (HY000): Operation DROP USER failed for 'userguy'@'%'

That didn’t work either. Instead I did this:

delete from mysql.user where User = 'userguy';
delete from mysql.db where User = 'userguy';

And when I tried to create the user again I was still met with ERROR 1396.

How to solve MySQL ERROR 1396 (HY000): Operation CREATE USER failed

Once those user entries are removed from the User and db tables, you just have to flush privileges:

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Now when you enter your CREATE USER command (with the correct syntax and a valid single line password), it should succeed.

And stop working when you are tired!

Creating MySQL Users and Granting Permissions

When should you create a new user in MySQL? That’s a question that you have to ask whenever you add a new application or restructure your data. I’ve been in places where every application connects as root, and obviously that’s very bad.

Only you know your situation. When I build web applications sometimes I’ll have a shared database, such as geographic information or lists or musical artists, that I want multiple applications to have access to. But only read access, they can’t change it. So root is out. But they also have their own database (in MySQL) or schema (in PostGreSQL) in which they will need to INSERT, UPDATE, DELETE and SELECT from tables. They may even need to CREATE tables, such as temp tables, depending on the situation. In these cases, such applications would require a dedicated user, because their are no commonalities that would allow this user account with its privileges to be reused by another application.

In this guide I’ll only deal with command line actions from MySQL client, not CPanel actions, but you’ll be able to transfer the concept to the different interface.

To create users you’ll most likely have to be root:

mysql -u root -p

Create the new user for this application:

CREATE USER 'username'@'hostname' IDENTIFIED BY 'password';

You’ll have to think about this

hostname

setting. If the only one accessing your database with this username is your application, and your MySQL process is on your local machine, then ‘localhost’ will do as a hostname. If you have a separate machine running MySQL as I do, then you’ll need to enter the hostname or IP address of your web server machine, as that is where the application will connect from.

CREATE USER 'username'@'webserver' IDENTIFIED BY 'password';

It is possible to use wildcards for the hostname, although you should think this through as generally it is a bad idea. If you are on an internal network and you have your web server cluster and SQL machine on the same subnet, you could do this:

CREATE USER 'username'@'10.20.30.%' IDENTIFIED BY 'password';

Which will allow 256 machines to connect to your MySQL server as that user. Think about the future use of your network before you do this.

GRANTing PRIVILEGES

Now you have a user who can’t do much. You’ll need to give them access to something. Since the user I’m creating will be the one populating the database with information and not just reading it, they will need greater privileges than the average web application that just reads and reports. The GRANT syntax can be found in the MySQL manual. My specific requirement allowed me to enter:

mysql> GRANT ALTER, CREATE TEMPORARY TABLES, DELETE, INSERT, LOCK TABLES, SELECT, UPDATE ON dbname.* TO userguy@10.20.30.40

  • This GRANT is followed by a list of actions the user can take as seen in the manual page quoted above, (SELECT, DELETE, etc).
  • ON a certain database and tables – dbname.* means these privileges apply to ALL tables on the database called dbname.
  • TO username @ IP address – meaning if this user connects from another address they won’t receive these privileges

DON’T enter:
mysql> GRANT ALL ON *.* TO userguy@%

Which is the equivalent of giving them root privileges on all tables of all databases when connecting from anywhere. Pointless to try and manage users this way. And never grant *.* to a user anyway unless you aware of what all the * privileges are. Most people would be mortified to find out they used *.* and granted their web user the ability to DROP all their databases. Of course they don’t find out until after it has happened if they don’t know the user had that ability with *.* to start with.

Now, just to be clear everything is in effect, issue the

FLUSH PRIVILEGES

command.

Keep in mind, that in newer versions, MYSQL will create a user when a GRANT is issued if the user doesn’t already exist. However, I prefer to explicitly CREATE a user, and keep track of it in a file.

Connecting remotely requires configuration settings changes. If you are not connecting locally, you’ll need to modify your MySQL’s my.cnf file to allow remote connections. This is part of your extended configuration and security, and I won’t discuss it in this article.

Now you can go about creating users and granting privileges with a total security plan in mind. Everyone can access only what they need, no one can access or change what they shouldn’t, and users can only connect from specific locations.

Optimizing Mysql Data Types

Optimizing MySQL datatypes is an important task for anyone, whether you are running a single website or a cluster of geographically distributed MySQL machines. Efficiency in your database driven application all begins with selecting the right datatypes for your columns.

In the process of a recent project I had to get data from a third party source and store it in a local MySQL database, for which we chose to build a dedicated MySQL server machine with a RAID 10 disk array. I had no idea what the data we retrieve would look like. All we had was a list of columns and nothing else. In order to get started we had to guess at the length of the data based on the column names assuming they were descriptive of the data they held. An initial table contained column names followed by simple VARCHAR(255) declarations. Totally unacceptable, but it was a start.

What you got there?

Once the data came back I had to examine each column in each table to see what they held. There are many ways of doing this including using MAX() and MIN() funtions like so:


mysql> SELECT MAX(LENGTH(first_name)) FROM emp;
+-------------------------+
| MAX(LENGTH(first_name)) |
+-------------------------+
| 14 |
+-------------------------+

This tells me that the maximum length of an employees first name is only 16 characters long. So why a VARCHAR(255) or a CHAR(100) or anything else? Let’s look at an example of the longest names in the table:


mysql> SELECT DISTINCT(first_name) FROM emp WHERE LENGTH(first_name) = 14;
+------------------+
| first_name |
+------------------+
| Thomas Michael |
+------------------+

That query gives us only the distinct uses of first_name that are 14 characters long. An examination of the real data is important when selecting datatypes, so you may want to do this again using > 10 in place of = 14, and you can even include a count of how many employees have large first_names, and any other queries that may help you understand the data you are working with.

The Average Joe

Another useful query is to determine the average length of the first name column:


mysql> SELECT AVG(LENGTH(first_name)) FROM emp;
+-------------------------+
| AVG(LENGTH(first_name)) |
+-------------------------+
| 5.9771 |
+-------------------------+

With the average length of a column being only 6 characters, does it make sense to define a fixed field of CHAR(14) to accomodate the longest field when only one employee has a first name that long? Would it be better to use a VARCHAR column anyway? A quick examination of how many names have each character count is required:


SELECT LENGTH(first_name) AS size,
COUNT(*)
FROM emp
GROUP BY LENGTH(first_name);

+——+———-+
| size | COUNT(*) |
+——+———-+
| 1 | 0 |
| 2 | 21 |
| 3 | 85 |
| 4 | 626 |
| 5 | 1259 |
| 6 | 1626 |
| 7 | 958 |
| 8 | 345 |
| 9 | 183 |
| 10 | 38 |
| 11 | 37 |
| 12 | 9 |
| 13 | 9 |
| 14 | 1 |
+——+———-+

 

This tells me that most people in our system have first names with an average character length between 4 and 7. Posessing and understanding this information will allow me to make decisions on the best column type. This information is also valuable when creating partial indexes on VARCHAR columns.

Procedure Analyse

One function available with MySQL is PROCEDURE ANALYSE. The manual for it is here:

http://dev.mysql.com/doc/refman/5.0/en/procedure-analyse.html

Replace 5.0 in the URL with your version.

The description from the manual for ANALYSE is:

ANALYSE() examines the result from a query and returns an analysis of the results that suggests optimal data types for each column that may help reduce table sizes.

That just took the guesswork out of your column declaration optimizations! Knowing your data is still critical because ANALYSE is still capable of guessing wrong, but it is another tool to assist you. To view the output of ANALYSE for the first name column, issue a query as follows:


SELECT first_name
FROM emp
PROCEDURE ANALYSE();

+————————+———–+———–+————+————+——————+——-+————————-+——+———————-+
| Field_name | Min_value | Max_value | Min_length | Max_length | Empties_or_zeros | Nulls | Avg_value_or_avg_length | Std | Optimal_fieldtype |
+————————+———–+———–+————+————+——————+——-+————————-+——+———————-+
| schema.emp.first_name | A | Zyrida | 1 | 16 | 0 | 0 | 5.9771 | NULL | VARCHAR(16) NOT NULL |
+————————+———–+———–+————+————+——————+——-+————————-+——+———————-+
1 row in set (0.01 sec)

As you can see in the output, PROCEDURE ANALYSE() returns a single row with:

  • the column name
  • the MIN and MAX values (not important for strings)
  • the MIN and MAX lengths (queries we ran manually above)
  • how many columns are empty or NULL
  • the AVG value (for numbers) or AVG length (for strings)
  • the population standard deviation (STD)
  • a recommended column definition called Optimal_fieldtype

Now, you have to be careful with this procedure when considering accepting the Optimal_fieldtype when passing the default arguments to PROCEDURE ANALYSE(). In a lot of cases, even in the case of a BOOLEAN type, it will return an ENUM, such as ENUM(‘0’, ‘1’). Close attention to detail is important in these cases. In the following example the column only holds one possible values at this time. It is an unfortunate case, but the data that is being returned to me uses this as a foreign key and as part of a primary key on many tables.


SELECT id
FROM company
PROCEDURE ANALYSE();
+--------------+-----------+-----------+------------+------------+------------------+-------+-------------------------+--------+------------------------+
| Field_name | Min_value | Max_value | Min_length | Max_length | Empties_or_zeros | Nulls | Avg_value_or_avg_length | Std | Optimal_fieldtype |
+--------------+-----------+-----------+------------+------------+------------------+-------+-------------------------+--------+------------------------+
| schema.company.id | 20000 | 20000 | 5 | 5 | 0 | 0 | 20000.0000 | 0.0000 | ENUM('20000') NOT NULL |
+--------------+-----------+-----------+------------+------------+------------------+-------+-------------------------+--------+------------------------+
1 row in set (0.00 sec)

In actuality, this ID is a numeric value, and further numeric ID values may be added over time, so a numeric data type would be more approriate than an ENUM. In order for PROCEDURE ANALYSE to return a correct number to us we’ll have to pass some arguments to it. According to the manual, the arguments are as follows:

max_elements (default 256) is the maximum number of distinct values that ANALYSE() notices per column. This is used by ANALYSE() to check whether the optimal data type should be of type ENUM; if there are more than max_elements distinct values, then ENUM is not a suggested type.

max_memory (default 8192) is the maximum amount of memory that ANALYSE() should allocate per column while trying to find all distinct values.

Retrying that query with arguments (1,24) we get:


SELECT id
FROM company
PROCEDURE ANALYSE(1,24);

+————–+———–+———–+————+————+——————+——-+————————-+——–+——————————-+
| Field_name | Min_value | Max_value | Min_length | Max_length | Empties_or_zeros | Nulls | Avg_value_or_avg_length | Std | Optimal_fieldtype |
+————–+———–+———–+————+————+——————+——-+————————-+——–+——————————-+
| schema.company.id | 20000 | 2000 | 5 | 5 | 0 | 0 | 20000.0000 | 0.0000 | SMALLINT(5) UNSIGNED NOT NULL |
+————–+———–+———–+————+————+——————+——-+————————-+——–+——————————-+
1 row in set (0.00 sec)

As you can see, optimize is now suggesting an Optimal fieldtype of SMALLINT(5) UNSIGNED NOT NULL which has a numeric value range of 0 to 65535 which is adequate for the purposes of our current and future ID allocation in this column.

Planning for the future

One thing to remember is that the data you are seeing today isn’t necessarily representative of the data you’ll get in the future, unless you have a really large set defined by someone else. You’ll have to be careful with the data types you define. Don’t make them unnecessarily large, but you can plan some wiggle room from the PROCEDURE ANALYSE optimal value. Use your mind here.

If you are creating your own data from scratch such as with a form entered on a website, then this technique will be most usefully applied by you after you’ve been up and running for some time and have some real data to analyze. If you are working with a small schema you’ll be able to ALTER your columns to the new data types after the fact.