Speeding up Magento imports (English)

Door Cipriano Groenendal op woensdag, 10 augustus, 2011

In dit artikel

We often receive feedback from customers having  problems with importing products into Magento. Now we have some tricks to speed it up, but usually we just laugh sheepishly and agree, “Yeah, Magento isn’t the fastest when it comes to importing …”, but we don’t really have a good solution for them. And even worse, merely finding the cause of the slow imports has proven to be quite a task. Fortunately, over the past few years, we’ve collected a large number of tools which can help us get to the bottom of this.

Magento imports for Dummies

Most users are familiar with the standard way of importing within Magento. Run an export, download the CSV and fill it with data, it’s not really rocket science. The true problem doesn’t show untill one uploads the CSV file to import it into Magento.

As a test we timed a few imports to get some data. We started with a Magento version 1.4.0.1 and ran an export / import. It was a basic install with some sample data and after a few seconds I had a nice testcase; a CSV file containing 120 products. Importing, however, took a lot longer. About 5 minutes later, all data was imported and the required indexes had been refreshed. 120 products in 5 minutes turns out to be about 2.5 seconds per product. Of course, this seemed a wee bit slow (understatement). My first thought was to check the settings, thinking I’d forgotten something obvious, but all was fine and caching was enabled. To verify the results I ran a second test, but unfortunately it did not show a speed improvement.

The source of the problem

The first thing we look at in such a case is, “What the hell is Magento doing …”. Unfortunately I didn’t have any PHP debugger around, but.. we do have strace. A super fast tool that shows exactly which system-calls a program makes. A few minutes later I ran a 2nd import with strace running on the server and the first thing I noticed was that there was a lot of traffic to the database. That was to be expected, as we’re importing data. The real shocker came after adding a counter. When importing a single simple product Magento performs up to 450 database queries. Now it’s been a while since I designed any application in MySQL and PHP, but I don’t recall ever needing 450 queries to import a single product.

Luckily strace saves the queries for us, so a simple manual scan cleared up several things. Magento uses the EAV Model for their database. In itself a good model for products with many potential attributes, but unfortunately, it’s hard to efficiently store it in an SQL database, which showed when we ran our import. Each row in the CSV file consists of about a 100 columns, and for each cell Magento has to run a seperate query. Add to that cleanup scripts, post-import re-indexing and transaction overhead, and you quickly end up at those 450 queries. Wow!

Magento imports for experts

Luckily, more and more tools are becoming available to optimize this process. It is rapidly becoming clear to the Magento community that Magento’s biggest drawback is their very extensive data model and heavy PHP API surrounding it. Unfortunately they have to always make use of their own API everything they do, they can’t get caught cutting corners in their own system.
The real world situation is quite different. In the real world you can choose to go for speed where necessary, and for imports that increasingly seems to be the choice made. For example, we received a good tip from a partner of ours a few weeks ago.

Introducing Magmi (MAGento Mass Importer)

Riksjasoft was having problems importing products on a decent schedule. The speed was so low that in some cases their imports used the full capacity of a single server for more than 7 hours, just to import a few thousand products. We joined forces for some debugging, but ultimately we weren’t able to solve the problem within Magento itself. Fortunately, they themselves came up with the solution: Magmi. This MAGento Mass Importer intentionally does not use the Magento PHP API but communicates directly with Magento’s SQL database. This allows a massive speed gain in comparison with the standard Magento Importer PHP API. Furthermore, it uses some clever tricks to run even faster. For example, it checks to see what has changed and updates only the changed attributes, instead of just overwriting everything. Since most day to day imports just sync a few changes this means a huge performance boost.

Of course, I also quickly installed and tested Magmi, and the results are quite impressive. The same import I used before (120 products) now costs only 5 seconds instead of 5 minutes and Wim from Rickshaw Soft had similar results to report:

“Where the default Magento cronjob imported about 5 to 6 products per minute on average, Magmi did around 400 products per minute. In addition to that, the database load was a lot lower, reducing the BPU’s used.

We also did a test import of 22,000 products on an in-house-server. The import itself took 4.5 minutes (5000 products per minute!) and the reindexing 6.5 minutes for a total time of 11 minutes for 22,000 products …

So what’s the difference between Magmi and this import method? Magmi initially reads all products and then indexes them. Because the import process Magento normally uses indexes all the time, the default method is so slow.”

That almost sounds too good to be true, wouldn’t you say. Unfortunately, yes, there are some downsides to this script. For example, the developers themselves still label Magmi as beta software, and point out that it isn’t exactly production ready. They also do not support all product types, as only the simple products work reliably for now, and the rest is still being developed. Still, for many shops this shouldn’t be a problem and Magmi could be the product of choice (obviously after doing some thorough testing).

Worth implementing?

Magmi is definitely interesting to test. However, it’s still beta software. We ran two simple tests in a test environment at Byte, but that doesn’t prove a lot. As such, we cannot support Magni if should anything go horribly wrong with your database. So think about the follwoing things before trying this at home:

  • Always test on a separate test environment. 
  • Run tests after importing to make sure EVERYTHING still works. From browsing to buying, purchase to payment, and reviews to returns, anything that has to do with your products needs to be tested.
  • Write down what’s you’ll be testing, and what you’ve changed. This makes it easy to create a checklist and make sure you don’t forget anything.
  • Make sure that before you go live with the import script, you have a recent backup of both the shop and the database. Byte provides this for customers, but it doesn’t hurt to make a copy yourself. Reclaiming a backup on Saturday night at 04:00 am may take a while and having a backup 5 minutes before your change is sometimes more useful than one made a few hours ago.
  • After going live: test, test, test, and keep an eye on your log files for any errors.

Partners en imports

As importing is a hot issue for Magento partners, Partners of Byte are also developing some interesting stuff. For instance Made by Mouses created an import tool for some of their customers and Jos Spyker aswell created a very advanced, production ready module. Details aren’t yet clear but as soon as we get our hands on them we’ll give them a test run (and update this blog) for sure.

Experience with Magmi? Please let us know by posting a comment on the blog. Even if you’ve used other software or have good tips for export / import acceleration, we’d love to hear them.

Werden er via de oude Magento import via de cronjob gemiddeld 5 à 6 producten per minuut ingelezen, via Magmi is dit ongeveer 400 producten per minuut.
Daarnaast is de database load veel en veel lager, waardoor het aantal BPU’s veel lager uitvalt dan eerst.

Op een lokale omgeving hebben we ook een testimport gedaan van 22.000 producten. De import zelf duurde 4,5 minuut (5000 producten per minuut !!), de herindexering daarna, 6,5 minuten, dus een totaaltijd van 11 minuten voor 22000 producten…

Wat is het verschil tussen Magmi en deze import methode? Bij de Magmi import worden eerst alle producten ingelezen en aan het einde vindt een soort indexering plaats. Doordat die indexering tijdens de normale import procedure van Magento plaatsvindt, is deze methode zo traag.

Hi! Mijn naam is Dion, Account Manager at Hypernode

Wil je meer weten over Hypernode's Managed E-commerce Hosting? Plan je online meeting.

plan een een-op-een meeting tel:+31648362102

Visit Hypernode at