Latest Updates: MySQL RSS

  • Articles

    KohanaPHP and Memcached

    Stii 3:56 pm on May 25, 2009 | Comments: 4 Permalink | Reply
    Tags: caching, database, , memcached, MySQL, optimization, sql

    Kohana PHPThis is shockingly simple… Almost so simple that I’m waiting for the problems, but after five days there has been none so far!

    When you develop a web application, at some point SQL query optimization just don’t cut it anymore. At Gatorpeeps, you will see we show each user’s latest blog post underneath each of his updates. The problem with this is that on any given page, we’ll run a query per update to get the user’s latest blog post. This is less than ideal, since it is repeating the same queries for every single visitor to a Gatorpeeps page.

    What we want to do is to cache the repetitive query for a certain amount of time, so that instead of querying the database, it fetches it from the cache, dramatically reducing database load and improving speed.

    With the Kohana PHP framework you have a number of options to your disposal namely:

    • File – File cache is fast and reliable, but requires many filesystem lookups.
    • SQlite – Database cache can be used to cache items remotely, but is slower.
    • Memcache – Memcache is very high performance, but prevents cache tags from being used.
    • APC – Alternative Php Cache
    • Eaccelerator
    • Xcache

    We opted for Memcached since it is fast, stable (so far at least!) and extremely simple to get going. We installed and configured it on our server (you should find lots of great guides for doing this if you search for it). We made sure the daemon is running then configured Kohana to use it. Very simple:

    Open the cache.php file located in the system/config directory. Change it to:

    $config['default'] = array
    (
    	'driver'   => 'memcache',
    	'params'   => '',
    	'lifetime' => 3600,
    	'requests' => 1000
    );
    

    If you run Memcached on the same server as your PHP site, you can skip ahead and implement it. If you run it on a different server, you need to change the cache_memcache.php file also located in system/config. If you have more than one memcached server, you can add them to the $config['servers'] array.

    $config['servers'] = array
    (
    	array
    	(
    		'host' => '127.0.0.1',
    		'port' => 11211,
    		'persistent' => FALSE,
    	)
    );
    
    /**
     * Enable cache data compression.
     */
    $config['compression'] = FALSE;
    

    In your source code, you can now use the cache. We have a static function in a helper method to fetch the latest blog post of a user. We get a cache instance and we check if the entry exists in the cache. If it does, we return the result otherwise we add it to the cache with a lifetime of 10 minutes.

    <?php
    ... truncated ...
    public static function getUserLatestBlogPost($user_id)
    {
        $cache = Cache::instance();
    
        // Get the latest blog post from the cache by looking for an unique id like latest_post_123
        $post = $cache->get('latest_post_' . $user_id);
        if ($post) {
            // Found an entry in the cache. Return it.
            return $post;
        } else {
            // There is nothing in the cache for this user, so we need to fetch it from the database and
            // add it to the cache.
            $newpost = self::sqlUserLatestBlogPost($user_id);
    
            // Add it to the cache, changing the default lifetime from 1 hour (3600 seconds) to 10 minutes (600 seconds)
            $cache->set('latest_post_' . $user_id, $newpost, NULL, 600);
            return $newpost;
        }
    }
    ... truncated ...
    

    That’s it. It is really that simple! :) Now, you would probably want to write a method that deletes an old entry if a new entry is inserted into the database and reload the new blog post into the cache.

    REMEMBER: Use caching wisely! It is great, but as they say, too much of a good thing is bad. First make sure your SQL queries are properly optimized before you cache it. Make sure the data you cache is okay to be cached. What I mean by that is it would be senseless to cache data that is very dynamic and changes often. Caching everything WILL have an undesirable effect ;)

    Welcome back! You should subscribe to my RSS feed here.
    You should follow me on Twitter here
    You should follow me on Gatorpeeps here.

    Save Cape Town City Ballet
     
  • Articles

    Installing Sphinx Search

    Stii 3:50 pm on November 17, 2008 | Comments: 1 Permalink | Reply
    Tags: , MySQL,

    Sphinx SearchNow that you have been introduced to Sphinx Search, its time to set it up or install it. It is extremely simple. Just have a look at the guide for Debian, to give you an idea:

    sudo apt-get update
    sudo apt-get dist-upgrade
    sudo apt-get install build-essential
    sudo apt-get install libmysqlclient15-dev
    
    tar xvzf sphinx-0.9.8-rc2.tar.gz
    cd sphinx-0.9.8-rc2/
    ./configure \
    --with-mysql-includes=/usr/include/mysql \
    --with-mysql-libs=/usr/lib/mysql
    make
    sudo make install
    

    Below are two of the best guides I could find. One for installing on Debian and the other on CentOS.

    Next time, we’ll look at configuring it to actually work. Also, how to run some queries and how the results from Sphinx Search looks.

    Save Cape Town City Ballet
     
  • Articles

    Sphinx Search introduction

    Stii 10:24 am on November 17, 2008 | Comments: 2 Permalink | Reply
    Tags: , MySQL,

    Sphinx SearchFull text search is usually not a nicety, but often a necessity. To search a database text field by simply doing a normal SQL query could often result in a slow query, which is NOT GOOD. Now, you might not always have a need for full text search, specially if you have a smallish data set, but if you get to a certain threshold you’ll hit serious performance problems.

    That is exactly what happened on Afrigator. We started out using normal, simple SQL SELECT queries.

    SELECT * FROM posts
    WHERE title LIKE '%keywords%'
    OR body LIKE '%keywords%'
    

    This was doing just great in the beginning. Nowadays our blog post table is close to a Gig in size and as you can imagine, that is a lot of words to try and filter to find specific terms! Funny thing is, a Gig is still tiny in web terms. What is a fact though is that it will only grow and with it, so would our performance issues!

    Thank goodness for Sphinx Search! Sphinx Search is an open source (GPL2 License) full text search engine. It is FAST. Not only is search fast, but so also is the text indexing.

    According to the Sphinx Search site, here is a list of features:

    • high indexing speed (upto 10 MB/sec on modern CPUs);
    • high search speed (avg query is under 0.1 sec on 2-4 GB text collections);
    • high scalability (upto 100 GB of text, upto 100 M documents on a single CPU);
    • provides good relevance ranking through combination of phrase proximity ranking and statistical (BM25) ranking;
    • provides distributed searching capabilities;
    • provides document exceprts generation;
    • provides searching from within MySQL through pluggable storage engine;
    • supports boolean, phrase, and word proximity queries;
    • supports multiple full-text fields per document (upto 32 by default);
    • supports multiple additional attributes per document (ie. groups, timestamps, etc);
    • supports stopwords;
    • supports both single-byte encodings and UTF-8;
    • supports English stemming, Russian stemming, and Soundex for morphology;
    • supports MySQL natively (MyISAM and InnoDB tables are both supported);
    • supports PostgreSQL natively.

    As you can see, that’s quite a nifty feature set! When we implemented it on Afrigator, we chose to run Sphinx Search as a daemon, though you could compile MySQL with Sphinx support should you choose.

    It was also SUPER SIMPLE to set it up! All you would have to have to get it running for your own application is to be able to compile C++ on your web server. I.e. you’d have to have a shell account and capabilities to compile using gcc on Linux.

    One tiny hiccup I experienced was that I needed to have the libmysqlclient-dev package installed (if you haven’t already) in order for Sphinx to compile. They don’t mention that in the documentation, so just be aware of that. Other than that, you simply have to follow the steps in the documentation to get it up and running.

    Once compiled successfully, you have to set it up and configure it. More on that at a later stage. For now, if you are developing web apps that is heavily dependent on full text search, I can recommend Sphinx Search as a viable solution.

    Save Cape Town City Ballet
     

About Me

Software developer at Afrigator.com Love Python, do PHP.
c
compose new post
j
next post/next comment
k
previous post/previous comment
r
reply
e
edit
o
show/hide comments
t
go to top
esc
cancel