Category Archives: SQL

Postfix, Dovecot and PostGreSQL Setup

In trying to setup a simple mail server on Ubuntu for a couple of small virtual domains with limited users I came across a couple excellent articles, but they were written for use with MySQL.  I have a machine with limited resources running custom software using PostGreSQL.  Although flat files would have been acceptable, I didn’t have the same luck in breadth of documentation.  So, I decided to work with the excellent articles I found and adapt them to Postgres for my server.

The best guide I found was from Linode, and is geared to help customers setting up Email with Postfix, Dovecot, and MySQL on their excellent platform.  The second guide I found left a lot to be desired, but since it was Postgres specific I was able to take those pieces from it and adapt it to the Linode guide.  That guide was called How to setup a postfix email server with Dovecot and published as a community tutorial from DigitalOcean.  While DigitalOcean and Linode are tight competitors, Linode wins out for having the better mail server how-to.

I’m going to mainly show what changes need to be done to the Linode guide in order to make it work with Postgres.  Otherwise use that guide for complete instructions. I’ll assume you already have a Postgres server up and running on that machine.

Postgres – Creating the Database

Your first change will be under the “Creating the Database” heading.  Let’s start by using the DigitalOcean guide to create the user.  Let’s try and keep the Linode naming convention as much as possible though.

First follow the adjustments to the  /etc/postgresql/vers.ion/main/pg_ident.conf  file.

mailmap         dovecot                 mailuser
mailmap         postfix                 mailuser
mailmap         root                    mailuser

Reload postgres and setup the user.

# sudo -u postgres psql
postgres=# CREATE USER mailuser;
postgres=# REVOKE CREATE ON SCHEMA public FROM PUBLIC;
postgres=# REVOKE USAGE ON SCHEMA public FROM PUBLIC;
postgres=# GRANT CREATE ON SCHEMA public TO postgres;
postgres=# GRANT USAGE ON SCHEMA public TO postgres;
postgres=# CREATE DATABASE mailserver WITH OWNER mailuser;
postgres=# \q 
# sudo psql -U mailuser -d mailserver
postgres=# \c mailserver
CREATE SEQUENCE seq_mail_domain_id START 1;
CREATE SEQUENCE seq_mail_user_id START 1;
CREATE SEQUENCE seq_mail_alias_id START 1;

CREATE TABLE virtual_domains (
  domain_id INT2 NOT NULL DEFAULT nextval('seq_mail_domain_id'),
  domain_name varchar(50) NOT NULL,
  PRIMARY KEY (domain_id)
);


CREATE TABLE virtual_users (
  user_id INT2 NOT NULL DEFAULT nextval('seq_mail_user_id'),
  domain_id INT2 NOT NULL,
  password varchar(106) NOT NULL,
  email varchar(100) NOT NULL,
  PRIMARY KEY (user_id),
  FOREIGN KEY (domain_id) REFERENCES virtual_domains(domain_id) ON DELETE CASCADE
);

CREATE TABLE virtual_aliases (
  alias_id INT2 NOT NULL DEFAULT nextval('seq_mail_alias_id'),
  domain_id INT2 NOT NULL,
  source varchar(100) NOT NULL,
  destination varchar(100) NOT NULL,
  PRIMARY KEY (alias_id),
  FOREIGN KEY (domain_id) REFERENCES virtual_domains(domain_id) ON DELETE CASCADE
);

To add your domains to the database:

INSERT INTO virtual_domains (domain_name)
VALUES
  ('example.com'),
  ('hostname.example.com'),
  ('hostname'),
  ('localhost.example.com');

Do a simple select from the virtual_domains table and note the id of the domain you wish to add addresses for:

SELECT * FROM virtual_domains;

According to the postgres document we referenced earlier, you have to get the password from command line like so:

# doveadm pw -s sha512 -r 100
Enter new password: ...
Retype new password: ...
{SHA512}.............................................................==

Now you need to add some email addresses like so:

INSERT INTO virtual_users
  ( domain_id, password , email)
VALUES
  ( '1', '{SHA512}.............................................................==', 'postmaster@example.com'),
  ( '1', '{SHA512}.............................................................==', 'supah_user@example.com');

Now, if you created your user according to the digital ocean document there is no password.  You’ll have to add one for your user like this:

ALTER ROLE mailuser WITH PASSWORD 'supersecretPaSSvvord';

That should be it for your direct Postgres editing. You can follow all the testing they show for MySQL since it looks like standard SQL.

Postfix

You can follow the Linode guide for postfix, only needing to make changes in step 8.  Just change in main.cf where they name their files by prepending mysql: to pgsql: and changing the filenames. Everything else there is the same.

  #Virtual domains, users, and aliases
  virtual_mailbox_domains = pgsql:/etc/postfix/pgsql-virtual-mailbox-domains.cf
  virtual_mailbox_maps = pgsql:/etc/postfix/pgsql-virtual-mailbox-maps.cf
  virtual_alias_maps = pgsql:/etc/postfix/pgsql-virtual-alias-maps.cf

Watch for these filenames further down as you’ll have to remember the change.  The actualy content of the files including the queries themselves from steps 11 on are the same as for MySQL.

The Linode document had a great line to test your db connection in step 15 and on.  Issue this from the shell:

postmap -q cmydomain.com pgsql:/etc/postfix/pgsql-virtual-mailbox-domains.cf

Dovecot

Keep in mind when working with dovecot configuration that if you don’t intend to use pop3, don’t enable it.

In step 25 specify pgsql instead of mysql in the driver line.

In step 27, Note in the dovecot-sql.conf.ext file your default_pass_scheme using the method above from the DigitalOcean document is SHA512,  not SHA512-CRYPT as in the Linode document.

Done!

That’s it! Everything else should be the same.  Follow the Linode guide to test your connections, make sure your firewall is open, and if you have any problems they have an excellent troubleshooting guide linked from the original article.  Enjoy!

Find Conficker Infected Machines with SGUIL

This command line query for mysql will grab for you a list of conficker infected machines for a given date range, their IP address, the count of events in the logs for each machine, and sort them by the biggest offenders. Distribution is SecurityOnion Linux.

# -A turns off "reading table information for completion of table and column names" for faster DB selection
 
mysql -uroot -A
 
use securityonion_db;

# change date range as needed
SELECT 
  INET_NTOA(event.src_ip),
  count(INET_NTOA(event.src_ip)) AS total
FROM event IGNORE INDEX (event_p_key, sid_time) 
WHERE event.timestamp > '2013-04-15' AND event.timestamp < '2013-04-16' 
AND  event.signature like '%Conficker%' 
GROUP BY INET_NTOA(event.src_ip)
ORDER BY INET_NTOA(event.src_ip) ASC

I was unable to get a list of host names of machines, which would have been nice when you have a large list of DHCP clients and aren’t looking at this query until many days after.

Another important query for security purposes is to obtain a list of IP addresses which the Conficker infected machines on your network are trying to contact. In this case I’m going to leave out the date range condition since I’m looking for the IP’s that have had the most activity of all time so I can ban them.  Network wide. For fun. Just ’cause I can.

SELECT 
  INET_NTOA(event.dst_ip),
  count(INET_NTOA(event.dst_ip)) AS total
FROM event IGNORE INDEX (event_p_key, sid_time) 
WHERE event.signature like '%Conficker%' 
GROUP BY INET_NTOA(event.dst_ip)
ORDER BY INET_NTOA(event.dst_ip) ASC

MySQL too many connections – solution

When greeted with a MySQL error this morning stating that a database connection could not be made from our PHP web application I had to do some testing.  First, try connecting to the database.  I did this from a remote host and was thrown back an “Error: too many connections.”

From there you should login as root from the localhost and issue:

SHOW FULL PROCESSLIST;

This will give a list of the connections. Take a good look at the big offenders because you’ll need to troubleshoot that application later and find out why it has so many connections.  Now that you know the problem, restart your MySQL and connections should resume as normal (with the excessive stale connections released) and you can go about the repair.

Also, in your my.cnf you can reduce the wait_timeout which will reduce the amount of time a connection can be held open by an application. The default is 28800 seconds, or 8 hours. I reduced mine to 30 minutes. You could go less but we have some apps that do their own connection pooling and don’t want to mess that up. In my.cnf:

wait_timeout = 1800

If you are absolutely sure you need more connections, you can increase this from the default 151 in your my.cnf as well:

# default max_connections is 151 (1 spare for super user)
max_connections = 201

Now you need to keep an eye on the situation by issuing the SHOW FULL PROCESSLIST once in a while to see if they are filling up fast. And don’t forget to visit any applications that may be showing up in there frequently.  I found two applications tha were using persistent connections, being held open for two hours, and they just didn’t need to.

How To Disable WordPress Plugins Manually

If you get stuck being unable to login to WordPress because of a misbehaving plugin, you have a couple options.

You can login to your WordPress database using command line MySQL client or phpMyAdmin, and from there look for the wp_options table. If you are using a security plugin that changes table prefixes you’ll have to replace wp_ with your prefix. Once there, look for a value in the option_name column called active_plugins. You can find it like this:

SELECT * FROM wp_options WHERE option_name like '%plugin%';

This will give you a list of plugin options which you can examine. If you are using WPMU or WP3 Multisite and this is a globally enabled plugin, you’ll need to then use table name with the ID of the site in the prefix, like wp_2_options instead of wp_options. It used to be that option_name “active_plugins” would contain all plugins, but it appears from examining this that option_name “transient_plugin_slugs” contains plugins. If you choose to modify it, copy and paste the original value somewhere for easy editing and replacement once fixed.

The option I chose for this Multisite plugin that prevented login was to simply rename the folder of it in the plugins directory. That would cause WordPress not to load the bad files, because they were no longer in the expected location.

Once into the admin dashboard page, you can then disable the plugin completely and take whatever actions you need to repair, replace or trash it for good.

Mysql Client Switch for Vertical Output

Did you ever need to examine one line from a MySQL table and were unable to line up the column headers with the data because the line was too long?

Well no fears, it turns out the MySQL client has a switch (\G) you can use to arrange output vertically for easier reading. This is very handy when you are looking at only one line. Here’s an example:


mysql> SELECT * FROM emp_vw WHERE id = 10110 LIMIT 1 \G
*************************** 1. row ***************************
id: 10110
company_id: 21100
last_name: Stall
first_name: John
gender: MALE
email: test@example.com
address: 277 142nd Ave
city: New York
state: NY
phone: (888)555-1212
birthday: 1963-05-12
active: 1
timestamp: 2011-11-30 18:58:50
1 row in set (0.01 sec)

This is a great way to examine the content of the fields of any table for the purposes of investigation. It works very well for views and tables with many columns that form long lines when displayed horizontally. This can also be applied to your my.cnf file, but I prefer to leave it off so it can be enable with the MySQL client \G switch only when needed.

Creating MySQL Users and Granting Permissions

When should you create a new user in MySQL? That’s a question that you have to ask whenever you add a new application or restructure your data. I’ve been in places where every application connects as root, and obviously that’s very bad.

Only you know your situation. When I build web applications sometimes I’ll have a shared database, such as geographic information or lists or musical artists, that I want multiple applications to have access to. But only read access, they can’t change it. So root is out. But they also have their own database (in MySQL) or schema (in PostGreSQL) in which they will need to INSERT, UPDATE, DELETE and SELECT from tables. They may even need to CREATE tables, such as temp tables, depending on the situation. In these cases, such applications would require a dedicated user, because their are no commonalities that would allow this user account with its privileges to be reused by another application.

In this guide I’ll only deal with command line actions from MySQL client, not CPanel actions, but you’ll be able to transfer the concept to the different interface.

To create users you’ll most likely have to be root:

mysql -u root -p

Create the new user for this application:

CREATE USER 'username'@'hostname' IDENTIFIED BY 'password';

You’ll have to think about this

hostname

setting. If the only one accessing your database with this username is your application, and your MySQL process is on your local machine, then ‘localhost’ will do as a hostname. If you have a separate machine running MySQL as I do, then you’ll need to enter the hostname or IP address of your web server machine, as that is where the application will connect from.

CREATE USER 'username'@'webserver' IDENTIFIED BY 'password';

It is possible to use wildcards for the hostname, although you should think this through as generally it is a bad idea. If you are on an internal network and you have your web server cluster and SQL machine on the same subnet, you could do this:

CREATE USER 'username'@'10.20.30.%' IDENTIFIED BY 'password';

Which will allow 256 machines to connect to your MySQL server as that user. Think about the future use of your network before you do this.

GRANTing PRIVILEGES

Now you have a user who can’t do much. You’ll need to give them access to something. Since the user I’m creating will be the one populating the database with information and not just reading it, they will need greater privileges than the average web application that just reads and reports. The GRANT syntax can be found in the MySQL manual. My specific requirement allowed me to enter:

mysql> GRANT ALTER, CREATE TEMPORARY TABLES, DELETE, INSERT, LOCK TABLES, SELECT, UPDATE ON dbname.* TO userguy@10.20.30.40

  • This GRANT is followed by a list of actions the user can take as seen in the manual page quoted above, (SELECT, DELETE, etc).
  • ON a certain database and tables – dbname.* means these privileges apply to ALL tables on the database called dbname.
  • TO username @ IP address – meaning if this user connects from another address they won’t receive these privileges

DON’T enter:
mysql> GRANT ALL ON *.* TO userguy@%

Which is the equivalent of giving them root privileges on all tables of all databases when connecting from anywhere. Pointless to try and manage users this way. And never grant *.* to a user anyway unless you aware of what all the * privileges are. Most people would be mortified to find out they used *.* and granted their web user the ability to DROP all their databases. Of course they don’t find out until after it has happened if they don’t know the user had that ability with *.* to start with.

Now, just to be clear everything is in effect, issue the

FLUSH PRIVILEGES

command.

Keep in mind, that in newer versions, MYSQL will create a user when a GRANT is issued if the user doesn’t already exist. However, I prefer to explicitly CREATE a user, and keep track of it in a file.

Connecting remotely requires configuration settings changes. If you are not connecting locally, you’ll need to modify your MySQL’s my.cnf file to allow remote connections. This is part of your extended configuration and security, and I won’t discuss it in this article.

Now you can go about creating users and granting privileges with a total security plan in mind. Everyone can access only what they need, no one can access or change what they shouldn’t, and users can only connect from specific locations.

Optimizing Mysql Data Types

Optimizing MySQL datatypes is an important task for anyone, whether you are running a single website or a cluster of geographically distributed MySQL machines. Efficiency in your database driven application all begins with selecting the right datatypes for your columns.

In the process of a recent project I had to get data from a third party source and store it in a local MySQL database, for which we chose to build a dedicated MySQL server machine with a RAID 10 disk array. I had no idea what the data we retrieve would look like. All we had was a list of columns and nothing else. In order to get started we had to guess at the length of the data based on the column names assuming they were descriptive of the data they held. An initial table contained column names followed by simple VARCHAR(255) declarations. Totally unacceptable, but it was a start.

What you got there?

Once the data came back I had to examine each column in each table to see what they held. There are many ways of doing this including using MAX() and MIN() funtions like so:


mysql> SELECT MAX(LENGTH(first_name)) FROM emp;
+-------------------------+
| MAX(LENGTH(first_name)) |
+-------------------------+
| 14 |
+-------------------------+

This tells me that the maximum length of an employees first name is only 16 characters long. So why a VARCHAR(255) or a CHAR(100) or anything else? Let’s look at an example of the longest names in the table:


mysql> SELECT DISTINCT(first_name) FROM emp WHERE LENGTH(first_name) = 14;
+------------------+
| first_name |
+------------------+
| Thomas Michael |
+------------------+

That query gives us only the distinct uses of first_name that are 14 characters long. An examination of the real data is important when selecting datatypes, so you may want to do this again using > 10 in place of = 14, and you can even include a count of how many employees have large first_names, and any other queries that may help you understand the data you are working with.

The Average Joe

Another useful query is to determine the average length of the first name column:


mysql> SELECT AVG(LENGTH(first_name)) FROM emp;
+-------------------------+
| AVG(LENGTH(first_name)) |
+-------------------------+
| 5.9771 |
+-------------------------+

With the average length of a column being only 6 characters, does it make sense to define a fixed field of CHAR(14) to accomodate the longest field when only one employee has a first name that long? Would it be better to use a VARCHAR column anyway? A quick examination of how many names have each character count is required:


SELECT LENGTH(first_name) AS size,
COUNT(*)
FROM emp
GROUP BY LENGTH(first_name);

+——+———-+
| size | COUNT(*) |
+——+———-+
| 1 | 0 |
| 2 | 21 |
| 3 | 85 |
| 4 | 626 |
| 5 | 1259 |
| 6 | 1626 |
| 7 | 958 |
| 8 | 345 |
| 9 | 183 |
| 10 | 38 |
| 11 | 37 |
| 12 | 9 |
| 13 | 9 |
| 14 | 1 |
+——+———-+

 

This tells me that most people in our system have first names with an average character length between 4 and 7. Posessing and understanding this information will allow me to make decisions on the best column type. This information is also valuable when creating partial indexes on VARCHAR columns.

Procedure Analyse

One function available with MySQL is PROCEDURE ANALYSE. The manual for it is here:

http://dev.mysql.com/doc/refman/5.0/en/procedure-analyse.html

Replace 5.0 in the URL with your version.

The description from the manual for ANALYSE is:

ANALYSE() examines the result from a query and returns an analysis of the results that suggests optimal data types for each column that may help reduce table sizes.

That just took the guesswork out of your column declaration optimizations! Knowing your data is still critical because ANALYSE is still capable of guessing wrong, but it is another tool to assist you. To view the output of ANALYSE for the first name column, issue a query as follows:


SELECT first_name
FROM emp
PROCEDURE ANALYSE();

+————————+———–+———–+————+————+——————+——-+————————-+——+———————-+
| Field_name | Min_value | Max_value | Min_length | Max_length | Empties_or_zeros | Nulls | Avg_value_or_avg_length | Std | Optimal_fieldtype |
+————————+———–+———–+————+————+——————+——-+————————-+——+———————-+
| schema.emp.first_name | A | Zyrida | 1 | 16 | 0 | 0 | 5.9771 | NULL | VARCHAR(16) NOT NULL |
+————————+———–+———–+————+————+——————+——-+————————-+——+———————-+
1 row in set (0.01 sec)

As you can see in the output, PROCEDURE ANALYSE() returns a single row with:

  • the column name
  • the MIN and MAX values (not important for strings)
  • the MIN and MAX lengths (queries we ran manually above)
  • how many columns are empty or NULL
  • the AVG value (for numbers) or AVG length (for strings)
  • the population standard deviation (STD)
  • a recommended column definition called Optimal_fieldtype

Now, you have to be careful with this procedure when considering accepting the Optimal_fieldtype when passing the default arguments to PROCEDURE ANALYSE(). In a lot of cases, even in the case of a BOOLEAN type, it will return an ENUM, such as ENUM(‘0’, ‘1’). Close attention to detail is important in these cases. In the following example the column only holds one possible values at this time. It is an unfortunate case, but the data that is being returned to me uses this as a foreign key and as part of a primary key on many tables.


SELECT id
FROM company
PROCEDURE ANALYSE();
+--------------+-----------+-----------+------------+------------+------------------+-------+-------------------------+--------+------------------------+
| Field_name | Min_value | Max_value | Min_length | Max_length | Empties_or_zeros | Nulls | Avg_value_or_avg_length | Std | Optimal_fieldtype |
+--------------+-----------+-----------+------------+------------+------------------+-------+-------------------------+--------+------------------------+
| schema.company.id | 20000 | 20000 | 5 | 5 | 0 | 0 | 20000.0000 | 0.0000 | ENUM('20000') NOT NULL |
+--------------+-----------+-----------+------------+------------+------------------+-------+-------------------------+--------+------------------------+
1 row in set (0.00 sec)

In actuality, this ID is a numeric value, and further numeric ID values may be added over time, so a numeric data type would be more approriate than an ENUM. In order for PROCEDURE ANALYSE to return a correct number to us we’ll have to pass some arguments to it. According to the manual, the arguments are as follows:

max_elements (default 256) is the maximum number of distinct values that ANALYSE() notices per column. This is used by ANALYSE() to check whether the optimal data type should be of type ENUM; if there are more than max_elements distinct values, then ENUM is not a suggested type.

max_memory (default 8192) is the maximum amount of memory that ANALYSE() should allocate per column while trying to find all distinct values.

Retrying that query with arguments (1,24) we get:


SELECT id
FROM company
PROCEDURE ANALYSE(1,24);

+————–+———–+———–+————+————+——————+——-+————————-+——–+——————————-+
| Field_name | Min_value | Max_value | Min_length | Max_length | Empties_or_zeros | Nulls | Avg_value_or_avg_length | Std | Optimal_fieldtype |
+————–+———–+———–+————+————+——————+——-+————————-+——–+——————————-+
| schema.company.id | 20000 | 2000 | 5 | 5 | 0 | 0 | 20000.0000 | 0.0000 | SMALLINT(5) UNSIGNED NOT NULL |
+————–+———–+———–+————+————+——————+——-+————————-+——–+——————————-+
1 row in set (0.00 sec)

As you can see, optimize is now suggesting an Optimal fieldtype of SMALLINT(5) UNSIGNED NOT NULL which has a numeric value range of 0 to 65535 which is adequate for the purposes of our current and future ID allocation in this column.

Planning for the future

One thing to remember is that the data you are seeing today isn’t necessarily representative of the data you’ll get in the future, unless you have a really large set defined by someone else. You’ll have to be careful with the data types you define. Don’t make them unnecessarily large, but you can plan some wiggle room from the PROCEDURE ANALYSE optimal value. Use your mind here.

If you are creating your own data from scratch such as with a form entered on a website, then this technique will be most usefully applied by you after you’ve been up and running for some time and have some real data to analyze. If you are working with a small schema you’ll be able to ALTER your columns to the new data types after the fact.