HMV.co.in

September 27, 2008

PHP mysql dump

Filed under: mobile, news — Tags: , , — Harsha M V @ 3:42 pm

MySQLDump class is now at version 2.0 and this article contains obsolete informations about the version 1.0. For information about the new version check this article

I searched a way to do a daily backup of my wordpress blog in my home pc but I didn’t find nothing that completely satisfy me, so I decided to code it.
I separated the problem in two scripts:
The first, that is written in php and runs on the web server, makes the dump of mysql databases from my hosting provider and leaves the dump on the web server.
The other one (written in bash, and running locally) downloads the site and the mysql dump from the web server, via FTP. Unfortunately I discovered that my hosting provider (Aruba.it) didn’t supply with mysqldump its server.

Lamp

First of all, I searched for a php script that makes the direct dump of the database.

I found a class that does the dump of mysql database directly, but it haves some bugs:

  1. stores all dump in one variable, so if the db is very large Apache will be angry;
  2. makes a bad dump of primary keys and indexes;
  3. doesn’t write binary fields in hexadecimal format;
  4. hasn’t no way to dump the db structure or data.

In my free time I corrected these bugs, and now the class is ready to make the full backup of a mysql database, and can compress the dump in gzip format!
This example shows how to use the class.

<?php

//if t=1 dumps the data, otherwise the structure

$data=$_GET['t'];

require(”class_mysqldump.php”);

//Instantiate the class: host name, user name, and password

$dump = new MySQLDump(”localhost”“root”“”);

//If you want to compress the output uncomment the follow line

//$dump = new MySQLDump(”localhost”, “root”, “”, False);

if ($data==”1″) {

    $dump->dumpDatabaseData(”dbname”$filename100);

    //If you don’t want binary fields saved in hexadecimal

    //format uncomment the follow line

    //$dump->dumpDatabaseData(”nomedb”, $filename, 100, False);

}

else {

    //dump the structure

    $dump->dumpDatabaseStructure(”nomedb”$filename);

}

//send file to standard output

header (’Content-Typeapplication/octet-stream’);

header(’Content-Dispositionattachmentfilename=”‘.$filename.’”‘);

$file=fopen($filename,”r”);

fpassthru($file);

fclose($file);

//delete temporary files

unlink($filename);

?>

Download MySQLDump. Downloads: 2703

September 23, 2008

php mysql dump

Filed under: MySQL, php — Tags: , , , , — Harsha M V @ 7:53 am

MySQLDump class is now at version 2.0 and this article contains obsolete informations about the version 1.0. For information about the new version check this article

I searched a way to do a daily backup of my wordpress blog in my home pc but I didn’t find nothing that completely satisfy me, so I decided to code it.
I separated the problem in two scripts:
The first, that is written in php and runs on the web server, makes the dump of mysql databases from my hosting provider and leaves the dump on the web server.
The other one (written in bash, and running locally) downloads the site and the mysql dump from the web server, via FTP. Unfortunately I discovered that my hosting provider (Aruba.it) didn’t supply with mysqldump its server.

Lamp

First of all, I searched for a php script that makes the direct dump of the database.

I found a class that does the dump of mysql database directly, but it haves some bugs:

  1. stores all dump in one variable, so if the db is very large Apache will be angry;
  2. makes a bad dump of primary keys and indexes;
  3. doesn’t write binary fields in hexadecimal format;
  4. hasn’t no way to dump the db structure or data.

In my free time I corrected these bugs, and now the class is ready to make the full backup of a mysql database, and can compress the dump in gzip format!
This example shows how to use the class.


<?php

//if t=1 dumps the data, otherwise the structure

$data=$_GET['t'];

require(”class_mysqldump.php”);

//Instantiate the class: host name, user name, and password

$dump = new MySQLDump(”localhost”“root”“”);

//If you want to compress the output uncomment the follow line

//$dump = new MySQLDump(”localhost”, “root”, “”, False);

if ($data==”1″) {

    $dump->dumpDatabaseData(”dbname”$filename100);

    //If you don’t want binary fields saved in hexadecimal

    //format uncomment the follow line

    //$dump->dumpDatabaseData(”nomedb”, $filename, 100, False);

}

else {

    //dump the structure

    $dump->dumpDatabaseStructure(”nomedb”$filename);

}

//send file to standard output

header (’Content-Typeapplication/octet-stream’);

header(’Content-Dispositionattachmentfilename=”‘.$filename.’”‘);

$file=fopen($filename,”r”);

fpassthru($file);

fclose($file);

//delete temporary files

unlink($filename);

?>

Download MySQLDump. Downloads: 2703

20 PHP optimization tips – make it faster

Filed under: MySQL, php — Tags: , , , — Harsha M V @ 7:50 am

Optimizing PHP is of a great importance, though it has not much impact in a small script, however the bigger the script is becoming some shortcomings can cause much inconvenience later. Here are several facts (although some of them could be disputed about) that may help you in choosing the the right way to achieve the best results.

1.

echo is faster than print, despite of that both are constructors, print is returning boolean result, echo returns nothing.
2.

printf is much more slower than echo or print, it must handle additional parameters
3.

avoid as much as possible magic methods like __get, __set, __autoload
4.

when searching string in another string strstr() is faster than preg functions, however if you want to know if that string exists use strpos().
5.

++$i works faster than $i++. This is PHP unique behavior.
6.

echo ‘Hello world’ works faster than echo “Hello world”, although speed difference is very tiny.
7.

$arr[‘id’] works much more faster than $arr[id]
8.

if you need to print a bigger piece of text without PHP variables just exit PHP (?>) and enter it when you need it again (<?php).
9.

ctype functions are faster than regular expressions. More info at http://php.net/ctype
10.

errors suppression with ‘@’ is useful, but makes script load slower.
11.

use full OS paths, /home/user/public_html/images/ instead of just images/
12.

don’t use variables if they will be used just 1 time.
13.

if you want to assign variable to one of the array indexes, don’t do $var=$arr[‘foo’], better choice would be use references. $var=&$arr[‘foo’], now you just creating a reference to $arr[‘foo’] instead of duplicating it’s value.
14.

use $_SERVER[’REQUEST_TIME’] instead of every time calling time() function. For example declare constant at the config file: define(“TIME”, $_SERVER[’REQUEST_TIME’]) and use TIME when you want to know timestamp.
15.

isset($arr[‘foo‘]) works faster than in_array(‘foo’, $arr)
16.

require_once() is useful, but works slower than common require()
17.

never use unwanted functions in loop header, for($i=0; $i<count($foo); $i++) – will count $foo on every loop step
18.

don’t use database queries in loops. Every major database system has enough possibilities to handle out that situation
19.

mysql_fetch_array() will return an array with both index types, mysql_fetch_assoc() will not return numerical indexed array, mysql_fetch_row() will return just numerical indexed array
20.

You must avoid several writing in one file. Better choice is to collect all data and with one writing session update file.

To conclude, it not always possible to follow these ways of optimizing the PHP. The main point of optimizing is that if you include as much optimizing tips in your script, the shorter the execution time of it will be and vice verse – the longer the script with only minimal optimization tips will prolong its execution time.

5 useful PHP functions for MySQL data fetching

abstract
PHP usually comes with mysql extension mysql (mysql_… functions) which is rather low-level and requires writing too much code for a trivial tasks: error-checked execution of SQL queries, getting single row from SQL table, getting value from SQL table cell, etc. OOP mysqli extension isn’t any better. In this article we present 5 useful functions which greatly simplify majority of real-life MySQL operations.
compatible
  • PHP 4
  • PHP 5

1. Function good_query. This function is analogous to mysql_query. It executes query and return result resource — the one, that could be used with standard functions mysql_fetch_row, mysql_fetch_assoc, mysql_num_rows and others, TRUE (for successful UPDATE, DELETE, DROP, etc SQL statements) or FALSE (for failed queries).

Any of good_query… functions has two arguments:

  • $sql – SQL query string
  • $debug – use debug logging (see ‘Debug logging’ below)

Advantages:

  • good_query() has built-in error checking: for failed queries it prints error string and original SQL query text both to standard output and PHP(web-server) error log
  • Debug logging: second argument of good_query() is $debug flag. When $debug == 1, good_query prints(to standard output) every executed query. When $debug == 2, good_query saves every executed query to error log.
  • ‘good_query’ is one letter shorter than ‘mysql_query’ and easier to type :-)
source code: php
<?php
function good_query($string, $debug=0)
{
    if ($debug == 1)
        print $string;

    if ($debug == 2)
        error_log($string);

    $result = mysql_query($string);

    if ($result == false)
    {
        error_log("SQL error: ".mysql_error()."\n\nOriginal query: $string\n");
        // Remove following line from production servers
        die("SQL error: ".mysql_error()."\b<br>\n<br>Original query: $string \n<br>\n<br>");
    }
    return $result;
}
?>

2. Function good_query_list. This function executes (with error checking) query, retrieves first row and returns it as an array.

Advantages:

  • Call only one good_query_list function instead of mysql_query, mysql_fetch_row, mysql_free_result
  • Debug logging: see good_query function

Note: You should probably add ‘LIMIT 1′ to all queries passed to good_query_list (since this function returns only first row)

source code: php
<?php
function good_query_list($sql, $debug=0)
{
    // this function require presence of good_query() function
    $result = good_query($sql, $debug);

    if($lst = mysql_fetch_row($result))
    {
        mysql_free_result($result);
        return $lst;
    }
    mysql_free_result($result);
    return false;
}
?>

Usage example:

source code: php
<?php

// Imagine, you want to fetch specific user information from users table
// You need information only for one user 

@list($login, $email, $settings) = good_query_list("SELECT login, email, settings
                                                      FROM users
                                                     WHERE id={$user_id}");
// we have added @ -- to avoid assignment warning when good_query_list
// did not found any rows and returned false
if ($login == false)
    die("no user {$user_id} found");

// A lot simplier than mysql_query, mysql_fetch_row, mysql_free_result functions
// and if statements

?>

3. Function good_query_assoc. Executes (with error checking) query, retrieves first row and returns it as a hash(associative) array. Very similar to good_query_list().

source code: php
<?php
function good_query_assoc($sql, $debug=0)
{
    // this function require presence of good_query() function
    $result = good_query($sql, $debug);

    if($lst = mysql_fetch_assoc($result))
    {
        mysql_free_result($result);
        return $lst;
    }
    mysql_free_result($result);
    return false;
}
?>

Usage example:

source code: php
<?php

// Imagine, you want to fetch specific user information from users table
// You need information only for one user 

$user = good_query_assoc("SELECT login, email, settings FROM users WHERE id={$user_id}");
if ($user == false)
    die("no user {$user_id} found");

echo "Login: ".$user['login'];
echo "\nEmail: ".$user['email'];
echo "\nSettings: ".$user['settings'];

?>

4. Function good_query_value. Executes (with error checking) query, retrieves first cell (first row and first column) and returns it.

  • Simpler than list($variable) = good_query_list(…) function.

Note: You should probably select only one field and add ‘LIMIT 1′ to all queries passed to good_query_value (since this function returns only first cell)

source code: php
<?php
function good_query_value($sql, $debug=0)
{
    // this function require presence of good_query_list() function
    $lst = good_query_list($sql, $debug);
    return is_array($lst)?$lst[0]:false;
}
?>

Usage example:

source code: php
<?php
// good_query_value is useful for count()-queries

$count = good_query_value("SELECT count(*) FROM table");
echo "$count row(s) in a table";
?>

5. And, finally, good_query_table. This function executes (with error checking) SQL query and retrieves entire result table as two-dimensional array. First dimension is rows(0..row count), second dimension is columns (associative hash).

  • Extremely useful for loading data to array-based templates

Note: Since all table is loaded into memory, you should use this function for small tables or add ‘LIMIT …’.

source code: php
<?php
function good_query_table($sql, $debug=0)
{
    // this function require presence of good_query() function
    $result = good_query($sql, $debug);

    $table = array();
    if (mysql_num_rows($result) > 0)
    {
        $i = 0;
        while($table[$i] = mysql_fetch_assoc($result))
            $i++;
        unset($table[$i]);                                                                                  
    }                                                                                                                                     
    mysql_free_result($result);
    return $table;
}
?>

Usage example (draw table as HTML):

source code: php
<?php
$table = good_query_table("SELECT * FROM table");

echo "<table border='1'>";
foreach($table as $row)
{
    echo "<tr>";
    foreach($row as $column=>$cell)
    {
        echo "<td>".$cell."</td>";
    }
    echo "</tr>"
}
echo "</table>";
?>

Another example (rendering blog data as HTML):

source code: php/html
<?php
// fetching first 20 records from simple blog table
// id - comments - text - posted_at
$blog = good_query_table("SELECT id, comments, text, posted_at FROM blog LIMIT 0, 20");

// output template:
foreach($blog as $post)
{
?>
    <div class="post">
        <p><?=$post['text']?></p>
        <div class="info">
            posted at <?=$post['posted_at']?>,
            <a href="./comment-url?id=<?=$post['id']?>"><?=$post['comments']?> comment(s)</a>
        </div>
    </div>
<?
}

?>

Some usage tips:

  • You may add time measure to good_query function, to find out how long you SQL queries are executed
  • Create trivial functions good_row, good_assoc, good_num, good_affected which will simple call correspondingly mysql_fetch_row, mysql_fetch_assoc, mysql_num_rows, mysql_affected_rows. They are much easier to type.
  • Since architecture of some other PHP database extensions (like pgsql for PostgreSQL) is very close to mysql, you could easily switch you application to use another DB server by rewriting good_… functions.

Download source of all ‘good_’ MySQL functions in one file: ae_good_mysql.txt (rename to .php after downloading). There are many ‘good’ functions there.

You may send feedbacks on this article.

warning
  • For production servers it’s better to remove die(“SQL error:”… output from function good_query(). Site visitors may be confused with SQL code, and hackers could use it to get knowledge of your database.
tested by AnyExample.com on 2007-03-21
  • FreeBSD 6.2 :: PHP 5.2.1
  • RHEL 5 :: PHP 4.4.5

September 22, 2008

Optimizing MySQL: Queries and Indexes

Filed under: MySQL — Tags: , , , — Harsha M V @ 6:25 pm

You know the scene. The database is just too slow. Queries are queuing up, backlogs growing, users being refused connection. Management is ready to spend millions on “upgrading” to some other system, when the problem is really that MySQL is simply not being used properly. Badly defined or non-existent indexes are one of the primary reasons for poor performance, and fixing these can often lead to phenomenal improvements. Consider an extreme example:


CREATE TABLE employee (
   employee_number char(10) NOT NULL,
   firstname varchar(40),
   surname varchar(40),
   address text,
   tel_no varchar(25),
   salary int(11),
   overtime_rate int(10) NOT NULL
);

To find employee Fred Jone’s salary(employee number 101832), you run: SELECT salary FROM employee WHERE employee_number = '101832';

MySQL has no clue where to find this record. It doesn’t even know that if it does find one matching, that there will not be another matching one, so it has to look through the entire table, potentially thousands of records, to find Fred’s details.

An index is a separate file that is sorted, and contains only the field/s you’re interested in sorting on. If you create an index on employee_number, MySQL can find the corresponding record very quickly (Indexes work in very similar ways to an index in a book. Imagine paging through a technical book (or more often, an scrambled pile of notes!) looking for the topic “Optimizing MySQL”. An index saves you an immense amount of time!

Before we repair the table structure above, let me tell you about a most important little secret for anyone serious about optimizing their queries: EXPLAIN. EXPLAIN shows (explains!) how your queries are being used. By putting it before a SELECT, you can see whether indexes are being used properly, and what kind of join is being performed…

For example:


EXPLAIN SELECT employee_number,firstname,surname FROM employee WHERE employee_number= '10875';
+----------+------+---------------+------+---------+------+------+------------+
| table    | type | possible_keys | key  | key_len | ref  | rows | Extra      |
+----------+------+---------------+------+---------+------+------+------------+
| employee | ALL  | NULL          | NULL |    NULL | NULL |    2 | where used |
+----------+------+---------------+------+---------+------+------+------------+

So what are all these things?

  • table shows us which table the output is about (for when you join many tables in the query)
  • type is an important one – it tells us which type of join is being used. From best to worst the types are: system, const, eq_ref, ref, range, index, all
  • possible_keys Shows which possible indexes apply to this table
  • key And which one is actually used
  • key_len give us the length of the key used. The shorter that better.
  • ref Tells us which column, or a constant, is used
  • rows Number of rows mysql believes it must examine to get the data
  • extra Extra info – the bad ones to see here are “using temporary” and “using filesort”

Looks like our query is a shocker, the worst of the worst! There are no possible keys to use, so MySQL has to go through all the records (only 2 in this example, but imagine a really large table).

Now lets add the index we talked about earlier.

If we re-run the EXPLAIN, we get:

+----------+-------+---------------+---------+---------+-------+------+-------+
| table    | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----------+-------+---------------+---------+---------+-------+------+-------+
| employee | const | PRIMARY       | PRIMARY |      10 | const |    1 |       |
+----------+-------+---------------+---------+---------+-------+------+-------+

The query above is a good one (it almost falls into the category of “couldn’t be better”). The type of “join” (not really a join in the case of this simple query) is “const”, which means that the table has only one matching row. The primary key is being used to find this particular record, and the number of rows MySQL thinks it needs to examine to find this record is 1. All of which means MySQL could have run this query thousands of times in the time it took you to read this little explanation.

Some knowledge of how indexes work allows you to use them more efficiently. Firstly, note that when you update a table with an index, you have to update the index as well, so there is a performance price to pay. But unless your system runs many more inserts than selects and the inserts need to be quick, and not the selects, this is a price worth paying.

What about if you want to select on more than one criteria? (As you can see, it only makes sense to index those fields you use in the WHERE clause.) The query:

SELECT firstname FROM employee;

makes no use of an index at all. An index on firstname is useless. But,

SELECT firstname FROM employee WHERE surname="Madida";

would benefit from an index on surname.

Let’s look at some more complex examples where EXPLAIN can help us improve the query. We want to find all the employees where half their overtime rate is less than $20. Knowing what you do, you correctly decide to add an index on overtime_rate, seeing as that’s the column in the where clause.

ALTER TABLE employee ADD INDEX(overtime_rate);

Now let’s run the query.

EXPLAIN SELECT firstname FROM employee WHERE overtime_rate/2<20;

+----------+------+---------------+------+---------+------+------+------------+
| table    | type | possible_keys | key  | key_len | ref  | rows | Extra      |
+----------+------+---------------+------+---------+------+------+------------+
| employee | ALL  | NULL          | NULL |    NULL | NULL |    2 | where used |
+----------+------+---------------+------+---------+------+------+------------+

Not good at all! Every single employee record is being read. Why is this? The answer lies in the “overtime_rate/2″ part of the query. Every overtime_rate (and hence every record) has to be read in order to divide it by 2. So we should try and leave the indexed field alone, and not perform any calculations on it. How is this possible? This is where your school algebra comes to the rescue! You know that ‘x/2 = y’ is the same as ‘x = y*2′.We can rewrite this query, by seeing if the overtime_rate is less than 20*2. Let’s see what happens.

EXPLAIN SELECT firstname FROM employee WHERE overtime_rate<20*2;

+--------+-------+---------------+---------------+---------+------+------+----------+
|table   | type  | possible_keys | key           | key_len | ref  | rows |Extra     |
+--------+-------+---------------+---------------+---------+------+------+----------+
|employee| range | overtime_rate | overtime_rate |       4 | NULL |    1 |where used|
+--------+-------+---------------+---------------+---------+------+------+----------+

Much better! MySQL can perform the 20*2 calculation once, and then search the index for this constant. The principle here is to keep your indexed field standing alone in the comparison, so that MySQL can use it to search, and not have to perform calculations on it.

You may say that I was being unfair, and should have phrased the request as “where the overtime rate is less than 40″, but users seem to have a knack of making a request in the worst way possible!

Ordering by surname is a common requirement, so it would make sense to create an index on surname. But in this example our employee table consists of thousands of people from Swaziland, and with the surname “Dlamini”. So we need to index on firstname as well. The good news is that MySQL uses leftmost prefixing, which means that a multi-field index A,B,C will also be used to search not only for a,b,c combinations, but also A,B as well as just A.

In our example, this means that an index of the type

ALTER TABLE employee ADD INDEX(surname,firstname);

is used for a queries such as

EXPLAIN SELECT overtime_rate FROM employee WHERE surname='Madida';

as well as

EXPLAIN SELECT overtime_rate FROM employee WHERE surname='Madida' and firstname="Mpho";

which both result in

+--------+------+-------------------+---------+---------+-------+------+-----------+
|table   | type | possible_keys     | key     | key_len | ref   | rows |Extra      |
+--------+------+-------------------+---------+---------+-------+------+-----------+
|employee| ref  | surname,surname_2 | surname |      41 | const |    1 |where used |
+--------+------+-------------------+---------+---------+-------+------+-----------+

However, the query

EXPLAIN SELECT overtime_rate FROM employee WHERE firstname='Mpho';

does not use an index, as firstname is not available from the left of the index, as shown below.

+----------+------+---------------+------+---------+------+------+------------+
| table    | type | possible_keys | key  | key_len | ref  | rows | Extra      |
+----------+------+---------------+------+---------+------+------+------------+
| employee | ALL  | NULL          | NULL |    NULL | NULL |    3 | where used |
+----------+------+---------------+------+---------+------+------+------------+

If you needed this kind of query, you would have to add a separate index on firstname.

The Query Optimizer, OPTIMIZE and ANALYZE

The magic inside MySQL that decides which keys, if any, to use to in the query, is called the query optimizer. It takes a quick glance at the index to see which indexes are the best to use. Compare it to searching for a CD by the artist “Savuka”, called “Third World Child”, where there are 2 indexes, one alphabetical by artist name, and the other by album name. At a glance, you see that there are 20000 unique artists, and 400000 unique albums, so you decide to search by artist. But if you knew that there were 50 Savuka albums, and that Third World child is the only album starting with “T”, your search criteria would change. You can provide similar information for the Optimizer by running

ANALYZE TABLE tablename;

This stores the key distribution for the table (running ANALYZE is equivalent to running myisamchk -a or myismachk –analyze).

Many deletes and updates leave gaps in the table (especially when you’re using varchar, or in particular text/blob fields). This means there are more unnecessary disk I/O’s, as the head needs to skip over these gaps when reading. Running

OPTIMIZE TABLE tablename

solves this problem. Both of these statements should be run fairly frequently in any well looked after system.

Another factor that most people don’t use when indexing is to take advantage of short indexes. You don’t have to index on the entire field. Our surname and firstname fields are 40 characters each. That means the index we created above is 80 characters. Inserts to this table then also have to write an additional 80 characters, and selects have 80 character blocks to maneuvre around (disk I/O is the primary hardware bottleneck, but that’s for another day!). Try reducing the size of your index – in the example above, rather use.

ALTER TABLE employee ADD INDEX(surname(20),firstname(20));

Now our updates write to an index half the size, and selects have a smaller index to search. Both will be faster (unless you make the indexes too short – imagine a book index, instead of giving the full word, only contained the first letter of the word!. You’d spend a lot of time looking up “semaphore” and “saxophone” when you actually wanted “SQL”. Don’t do the same to MySQL!

The same applies to the original field definitions. In these days of ample disk space, we don’t often worry about space. But smaller usually means faster, so defining our surname and firstname fields as CHAR (255) would be a mistake if the biggest firstname is never more than 20 characters! You don’t want to cut names off, but remember that you can ALTER the field later if conditions change, and you need to allow for more characters. I also suggest using VARCHAR rather than CHAR (variable length characters rather than fixed length characters), even though many don’t recommend this as they are more subject to fragmentation. I overcome this by using OPTIMIZE often.

Most systems need to be highly optimized for selects – take a news site which performs millions of queries per day, but where the data arrives in large batches of text files. So for parts of the day, inserts need to be optimal, without noticeably affecting the millions trying to access the data…

Assuming a nicely formatted ‘|’ delimited text file that we want to insert into the table above, take this piece of PHP code:

if (!($fp = fopen("datafile.txt","r"))) {               // open the file for reading
  print "\nUnable to open datafile.txt for writing";    // display error
  exit();                                       // end the running of the program
}

while (!feof ($fp)) {                   // loop through the file line by line
  $sline = fgets($fp, 4096);            // put the data into the variable $sline
  $sline = chop($sline);                        // remove the newline
  list($eno,$fname,$sname,$telno,$salary) = split("|",$code);
                        // split the line on "|", populating the ind. variables
  $db->query("insert into employee(employee_number,firstname,surname,
tel_no, salary
   values($eno,'$fname','$sname','$tel_no', $salary)");
}                                               // end while loop

This would work, but would be very slow. The index buffer would be flushed after every insert. Until recently, MyISAM tables (The MySQL default) did not allow data to be inserted at the same time as being read. The new format does, but only if there are no deleted records present (highly unlikely in a heavily used system). So the entire table is locked for the duration of each insert. Fine on a low volume site, but when you’re getting hundreds or thousands of queries per second, you’ll soon notice the backlog!

There’s a solution however – the best way to insert the data is to use MySQL’s “LOAD DATA INFILE“. This is much faster (20 times according to MySQL), and the only way some systems I’ve seen are still hanging in there!

The syntax is simple, and the code becomes a lot simpler too:

$db->query("LOAD DATA INFILE 'datafile.txt' INTO TABLE employee (employee_number,firstname,surname,tel_no,salary) FIELDS TERMINATED BY '|'");

LOAD DATA INFILE has defaults of:

FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'

if you don’t specify any of these clauses. And, just as with an ordinary insert, you need to specify a field list if the order of the fields is different, or, as in the example above, you’re not inserting data for every field. Always specifying a field list is good practice for all queries anyway – if someone adds a field to the table at a later stage, you don’t want to go back and have to fix all your previous INSERT and SELECT * statements.

If you can’t get this to work properly, have a look at the format of your text file – every problem I’ve seen with LOAD DATA has been because of a corrupted text file. Every field in every row must be delimited correctly!

You may not always be inserting from a text file – perhaps your application needs to do many unrelated inserts continually. There are ways to make sure the mass of users selecting are not badly affected… The first is to use INSERT LOW PRIORITY. This waits until there are no more reads waiting to happen, waiting for the gap, and not pushing in as it were. Of course, if your database is a rush hour special, there may never be a gap, and the client performing the INSERT LOW PRIORITY may start to grow cobwebs! An alternative here is INSERT DELAYED. The client is immediately freed, and the insert put into a queue (with all the other INSERT DELAYED’s still waiting for the queue to end). This means that there can be no meaningful information passed back to the client, (such as the auto_increment value), as the INSERT has not been processed when the client is freed. Also, be aware that a catastrophe such as an unexpected power failure here will result in the queued INSERT’s being lost. For neither of these methods do you have any idea when the data will be inserted, if at all, so I suggest you use with caution.

Conclusion

It’s not only getting the data in that needs to be quick – sometimes you need to get it out quickly too. (Say you’ve accidentally loaded yesterday’s classified ads, for example). Don’t do a:

DELETE FROM classifieds;

Rather, use:

TRUNCATE TABLE classifieds;

The difference here is that DELETE drops records one by one, and that can be 1 million one by one’s too slow! Note that this does not apply before version 4.0 of MySQL. At time of writing, most of you will still be using 3.x versions (if you do a DELETE FROM tablename on a non-empty table, and get 0 records back as a result, you’re running an earlier version. To fix this problem, MySQL made DELETE remove records one by one so as to return the number of records deleted, but TRUNCATE still did the quick delete. Also, earlier versions than 3.23.33 used TRUNCATE tablename, not TRUNCATE TABLE tablename)

This has only been a brief introduction to optimizing queries and indexes. Once you’ve mastered these tips, you’ll want to look at your more complex joins that just never seem to perform quickly. The MySQL site has lots of useful information, and there are a few good books out there as well. But don’t forget EXPLAIN! Often the best way is to try and rewrite the query in as many different ways as possible, and see which one runs more efficiently. You can learn a lot by trying to figure out why one alternative ran faster than the other. Good luck!

resource list:
www.mysql.com – complete documentation for MySQL

Automating MySQL Backups with Cron

Filed under: php — Tags: , , , — Harsha M V @ 6:17 pm


Thousands of web applications use MySQL for their databases, and almost all of them store some sort of valuable information. You never know when a hard drive will fail, a server will die, or in the absolute worst case, when someone will find a vulnerability that allows them to delete your databases. Sometimes we just take for granted how well everything just… works. We’re going to look at two simple ways to backup your databases, and how to automate them using crontabs. First, we’re going to set up the command or script that will back up the database(s). There’s a few things you’re going to need to configure before you start backing up, so let’s get to it…

There are a few different ways you can do these backups, so I’ve provided a method to use both PHP (requires php-cli) and a regular Linux shell command with MySQL to do the same thing.

Using the command line

You can use the following command to back up your MySQL database.:

mysqldump -uusername -ppassword dbname | gzip > /home/username/folder/mysql_backup.gz

You’ll have to set the username, password, dbname, and path to save in. Also note, the -u and -p preceding the username and password must be there, or it will not work correctly.

To run it automatically, save your configured command from above into a file named anything you want, with as a shell script (.sh). We’ll talk about automating them in just a minute, but for a little more flexibility, I prefer…

Using a PHP Script

You can also back up your databases using a PHP script. Setting it up takes a little longer, but it is more configurable if you aren’t as proficient with shell scripting. You can download the PHP Class (Based on MySQLDump – written by Daniele Vigan, GPL) To configure the PHP script, you’ll have to do the following:

Set your database settings) All of this information is available in your hosting providers Control Panel, or can be managed easily using phpMyAdmin or a similar MySQL management program. If you’re not sure about the host, you can leave it as localhost most of the time.

/* Database details */
$dbname = “databasename“;
$dbhost = “localhost“;
$dbuser = “user“;
$dbpw = “pass“;

Compress backed up files) If you want to compress the backups that you make, you can set this to true to gzip the files before saving them.

/* Compress as gzip? */
$compression = false;

Dump as hex) Choose if you want to dump the database as hex.

/* Dump in hex? */
$hex = false;

Filename to back up to) Set the filename that you want to save the backup as. You don’t have to put an extension on the end of it. You can also select if you want to add a date and timestamp in the filename. I would recommend this, as it makes keeping them together easy, while still being able to sort find them fairly quickly.

/* Filename (No extension) */
$filename = “filename”;

/* add date/timestamp to filename? */
$filedate = true;

Keep in mind, (mentioned by resmo in the comments) that if your database is bigger than PHP’s default memory limit of 8MB, you’ll have to change your php.ini, or add ini_set ( “memory_limit”, “20M”) to your script to change the limit. Once you’ve set all of those details, I would highly suggest that you test it out a few times, to make sure that it is dumping all of the information correctly and to your preferences, and then we can proceed with automating your MySQL backups.

Automating with cPanel, Plesk, or similar

Now, if you’re using a host that provides you with a program like cPanel or Plesk, you’ll have an area in your control panel for setting up crontabs. Here’s an example of how cPanel handles crontabs:

Adding backup scripts in crontabs

If you’re going to run the command version of the backup, you can paste your configured command into the command field, choose when you want it to run, and click go. If you’re running the PHP version, you must first copy your files into a directory, and then enter php /home/username/folder/script.php for the command. Remember, the PHP version will only work with php-cli installed on the server.

Automating without a cPanel (crontabs)

If your hosting provider does not provide you with a Control Panel, and you have shell access, you can set crontabs using the command line. (using crontabs)

Should you choose to go the PHP route, I would advise that you make sure nobody but that machine can access that folder, so that nobody can download your backups. (covered briefly here).

Now you can be certain that your MySQL databases are backed up properly – as often as you’d like. How do you keep your databases backed up? How often have do you usually back up your databases? Hopefully you’ll never have to go through the pain of losing a whole database as I have in the past. Be safe, be smart, keep regular backups in multiple locations.

Blog at WordPress.com.