Setting BULK_INSERT_BUFFER_SIZE for loading data faster on MyISAM


How fast can you load data into MyISAM? No single number answers that question, but are there techniques to load data faster? Absolutely! Often it is the simple things that matter most. The following are quick wins for loading data faster on MyISAM:


  • If you are doing a bulk insert, consider disabling the indexes before loading the data (this affects non-unique indexes only) and enabling them again afterwards:
                   alter table table1 disable keys;


  • The BULK_INSERT_BUFFER_SIZE variable limits the size of the cache tree in bytes per thread. If data is being added to a non-empty table, tuning bulk_insert_buffer_size can make data insertion faster. The improvement normally shows when more than 10k rows are being inserted. It is hard to say what the right value is, so use trial and error with incrementally larger buffer sizes. Setting it to 0 disables this optimization. The default value is 8MB. The maximum value is 4GB on 32-bit platforms; 64-bit platforms accept larger values. The bulk_insert_buffer_size variable can be set at both the global and session levels.
  • Consider setting MYISAM_SORT_BUFFER_SIZE and KEY_BUFFER_SIZE to suitable values as well.
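
The quick wins above can be combined into one load session. The following is a minimal sketch, not a tuned recipe: table1 comes from the example above, while the 64MB value and the file path /tmp/table1.csv are assumptions chosen only for illustration.

```sql
-- Check the current value first (the default is 8MB).
SHOW VARIABLES LIKE 'bulk_insert_buffer_size';

-- Raise the buffer for this session only.
-- 64MB is an arbitrary starting point; adjust it by trial and error.
SET SESSION bulk_insert_buffer_size = 64 * 1024 * 1024;

-- Stop maintaining non-unique indexes during the load.
ALTER TABLE table1 DISABLE KEYS;

-- '/tmp/table1.csv' is a hypothetical file path.
LOAD DATA INFILE '/tmp/table1.csv'
INTO TABLE table1
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- Rebuild the disabled indexes in one pass.
ALTER TABLE table1 ENABLE KEYS;
```

Setting the variable at session level keeps the larger buffer limited to the connection doing the load, so other connections keep the server-wide value.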

Things to worry about while considering optimal MySQL performance


  • Optimal table structure is very important. In particular, columns must have the right data types and be designed to suit the data flowing through MySQL. Typically, tables with heavy UPDATE traffic should be conservative about adding columns, while tables serving heavy SELECT traffic tend to have more columns (hopping between tables for data is expensive).
  • Optimal indexing is good, but over-indexing is a long-term problem to solve.
  • What is your locking strategy? Shared access keeps MySQL operations running smoothly; use exclusive locks only when you have critical, top-priority transactions. The InnoDB storage engine handles most locking issues without involvement from you, allowing for better concurrency in the database and reducing the amount of experimentation and tuning for your code.
  • Use the available memory optimally for caching schema and SQL. Oversized memory allocation causes excessive paging.
  • Distribute data across multiple disks to optimize disk seek time.
  • Modern disks with higher throughput can do magic, and throughput is easier to optimize than seeks because you can read in parallel from multiple disks.
  • When your most frequently accessed data is optimally placed in memory, you get the most out of your CPU investment. Main memory bandwidth often becomes a bottleneck when the CPU needs more data than can fit in the CPU cache.
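
To make the shared versus exclusive distinction in the locking point concrete, here is a minimal sketch using InnoDB row locks. The accounts table and its id and balance columns are hypothetical and exist only for illustration.

```sql
-- Shared lock: other sessions can still read (and share-lock) the row,
-- but cannot modify it until this transaction ends.
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1 LOCK IN SHARE MODE;
COMMIT;

-- Exclusive lock: reserve the row for a critical change.
-- Other sessions that want to lock or modify the row wait until COMMIT.
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;
```

Keeping the exclusive transaction short limits how long other sessions have to wait for the row.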

Which my.cnf file is being used?

To find out which my.cnf files are read, and in what order, you can use:

 /usr/sbin/mysqld --verbose --help | grep -A 1 "Default options"


[[email protected] yum.repos.d]# /usr/sbin/mysqld --verbose --help | grep -A 1 "Default options"
2017-08-17 12:22:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.37) starting as process 5418 ...
2017-08-17 12:22:50 5418 [Note] Plugin 'FEDERATED' is disabled.
Default options are read from the following files in the given order:
/etc/my.cnf /etc/mysql/my.cnf /usr/etc/my.cnf ~/.my.cnf 
2017-08-17 12:22:50 5418 [Note] Binlog end
2017-08-17 12:22:50 5418 [Note] Shutting down plugin 'MyISAM'
2017-08-17 12:22:50 5418 [Note] Shutting down plugin 'CSV'
[[email protected] yum.repos.d]# 


Why is MySQL My Favourite?

I always get this question: why MySQL, Shiv? So I thought I would write this blog. I have been playing around with database systems for around 20 years (yes, from the very early days of my college). I have worked across several database systems like Oracle, PostgreSQL, DB2, SQL Server and Sybase, and they are all great for building optimal enterprise-class applications for sure! So why do I choose MySQL? There are several reasons, and I have tried to capture a few of them below:
  1. MySQL for High Performance Web Applications: MySQL is built for optimal web application performance, and it keeps getting faster with improvements in the optimizer and in InnoDB storage engine efficiency. To know more about MySQL performance benchmarking results, please visit http://www.mysql.com/why-mysql/benchmarks/
  2. MySQL is Easy To Learn: The MySQL documentation is well maintained and detailed, and the strong community is very dynamic and open to helping you and sharing knowledge. It is much easier to set up and manage MySQL compared to many other complex open or proprietary RDBMS.
  3. MySQL Supports SQL: MySQL supports standard SQL (Structured Query Language), the standard language for data definition and manipulation in the RDBMS world.
  4. Multi-Platform Support: MySQL runs optimally on almost all variants of Linux and on other operating systems like Windows and Mac.
  5. Open / Free: MySQL is an open source project available under the terms of the GNU General Public License (GPL).
  6. Freedom Of Choice: There are several forks and branches available for MySQL, like Percona Server with XtraDB (drop-in replacement for MySQL and InnoDB), MariaDB (better MySQL with default storage engine XtraDB), Facebook MySQL, and WebScaleSQL (MySQL branch from engineers of Facebook, Alibaba etc.).

The Commandments for Optimal MySQL Performance

What do we have to keep in mind for optimal MySQL performance? There are many things; I am just writing down what I seriously consider for optimal and scalable MySQL performance. As usual, comments are most welcome. Thank you!
  1. Be generous when buying physical memory. InnoDB works great when files are accessed from memory rather than from disk.
  2. A lean MySQL schema and lean SQL work great. Trust me, fat is not optimal anywhere!
  3. Choose your data types carefully. The smaller ones are better.
  4. You need indexes to search records faster, but index overuse or abuse has a negative impact on performance. Remove unused indexes.
  5. Occasionally MySQL goes wrong in choosing the optimal index. In those cases, help MySQL with hints like "USE INDEX".
  6. Think seriously about indexing the columns used in WHERE, GROUP BY, and ORDER BY clauses.
  7. Fewer rows examined is great!
  8. Reduce disk reads: compress TEXT and BLOB columns.
  9. Make schema changes incrementally. Sometimes small changes cause significant performance improvements or degradation.
  10. Look out for duplicate records; they cause serious performance bottlenecks.
  11. Archive old records. They make sorts and searches especially expensive.
  12. Profile your MySQL regularly to establish a performance trend chart. Nothing goes wrong overnight; in many cases performance degrades over a period of time, so think proactively. That's cool :)
  13. Fat multi-core processors with hyper-threading enabled work great for high-performance MySQL operations.
  14. Watch out for excessive pagination queries.
  15. Go conservative about MySQL Configs (sometimes less is more).
  16. Be cautious with the MySQL configuration variables innodb_thread_concurrency and thread_concurrency. Don't use them if you don't clearly understand how they work!
  17. A high max_connections value consumes RAM excessively.
  18. Use the query cache only if you have repetitive queries. Using the query cache on data that changes often will give you a performance hit.
  19. Remember to warm up your MySQL database after every restart. This gets the schema and SQL back into memory.
  20. Fetch only the data you need in SELECT queries 
  21. Consider persistent connections instead of repeatedly opening new connections.
  22. sort_buffer_size eats memory very fast because it is allocated per connection, so please be very cautious when increasing it.
  23. Use large redo logs
  24. Measure the cost of a query by its response time. Use the MySQL slow query log to diagnose query performance.
  25. I use SHOW PROCESSLIST when load increases. This works well for me during diagnostics and forensics of slow queries.
  26. Use Performance Schema wisely. It is a great platform for MySQL performance instrumentation.
  27. Avoid using Triggers as much as possible. 
  28. Use ARCHIVE Storage engine for audit purposes. 
  29. You need backups, but take them on a secondary replication instance. This is a quick win to boost the performance of your primary (master) MySQL instance.
  30. Split reads across several instances or disks. This helps distribute load efficiently.
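
A few of the commandments above map directly to statements you can run from a MySQL client. The following is a minimal sketch under stated assumptions: the orders table, its columns, and the idx_customer index are hypothetical, and the 1 second threshold is only an illustrative value.

```sql
-- Commandment 5: suggest an index to the optimizer, then check the plan.
-- `orders` and `idx_customer` are hypothetical names.
EXPLAIN
SELECT id, total
FROM orders USE INDEX (idx_customer)
WHERE customer_id = 42
ORDER BY created_at;

-- Commandment 24: log statements that take longer than 1 second.
-- The new global long_query_time applies to connections opened afterwards.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Commandment 25: see what every connection is running right now.
SHOW FULL PROCESSLIST;

-- Commandments 17 and 22: review the settings that multiply memory use
-- by the number of connections before raising either of them.
SHOW VARIABLES LIKE 'max_connections';
SHOW VARIABLES LIKE 'sort_buffer_size';
```

Comparing the EXPLAIN output with and without the hint is a quick way to confirm the hint actually reduces the rows examined before keeping it in application code.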