Diff: memory exhausted

From lxadm | Linux administration tips, tutorials, HOWTOs and articles

I tried to diff two files of about 800 MB each on a machine with 1 GB of RAM. Unfortunately, it didn’t work:

# diff -u full/2007-09-25-full.sql incr/incr.sql
diff: memory exhausted

After some googling and looking through various blogs and articles, it appeared that GNU diff simply works that way: it wants to load both files into memory. So, perhaps adding some swap space would help?

# dd if=/dev/zero of=/swapfile bs=1024 count=5242800
# mkswap /swapfile
# swapon /swapfile
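
As a quick sanity check on the numbers above, here is a small sketch of the arithmetic behind the dd command. It only reuses the bs and count values shown above; the variable names are made up for illustration.

```shell
# Size of the swap file created by the dd command above.
bs=1024          # block size in bytes, as passed to dd
count=5242800    # number of blocks, as passed to dd

swap_bytes=$((bs * count))               # total size in bytes
swap_mib=$((swap_bytes / 1024 / 1024))   # rounded down to whole MiB

echo "swap file: ${swap_bytes} bytes (~${swap_mib} MiB)"
```

So the swap file is just under 5 GiB. Once the diff is done, the extra swap can be disabled again with swapoff /swapfile and the file removed.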

Indeed, it helped: I was able to diff these two ~800 MB files. However, as soon as I tried to diff ~1 GB files, diff again exited with a “memory exhausted” message.

I tried a number of other tools, but none of them gave the output I wanted (similar to that of diff -u).

As I needed that diff for backups only, I ended up using the xdelta tool:

# xdelta delta -9 full/2007-09-25-full.sql incr/incr.sql 2007-09-25-sql.delta

It gives a binary delta file, but is able to produce it using much less memory than diff.

Ooh, and if your files are so big that xdelta is failing for you…

# xdelta delta -9 full/2008-06-01-full.sql incr/incr.sql 2008-06-02-incr.delta.temp
xdelta: mmap failed: Cannot allocate memory

...you may consider using xdelta3. The deltas it produces are a bit bigger, but it works much better (read: does not fail) with large files. Syntax:

# xdelta3 -9 -f -e -s full/2008-06-01-full.sql incr/incr.sql 2008-06-02-incr.delta.temp
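
A delta is only useful for backups if the newer file can be rebuilt from it, so here is a minimal round-trip sketch. It assumes xdelta3 is installed (it skips itself otherwise) and uses small throwaway files with made-up names instead of the real SQL dumps. Decoding uses the -d flag with the same -s source file.

```shell
# Round-trip check on small throwaway files (hypothetical names, not the real dumps).
workdir=$(mktemp -d)
printf 'line 1\nline 2\nline 3\n' > "$workdir/full.sql"
printf 'line 1\nline 2 changed\nline 3\nline 4\n' > "$workdir/incr.sql"

roundtrip=skipped
if command -v xdelta3 >/dev/null 2>&1; then
    # Encode with the same flags as above:
    # -9 compression level, -f overwrite output, -e encode, -s source file.
    xdelta3 -9 -f -e -s "$workdir/full.sql" "$workdir/incr.sql" "$workdir/incr.delta"

    # Decode: rebuild the newer file from the source file plus the delta.
    xdelta3 -f -d -s "$workdir/full.sql" "$workdir/incr.delta" "$workdir/restored.sql"

    # The restored file should be byte-identical to the original newer file.
    if cmp -s "$workdir/incr.sql" "$workdir/restored.sql"; then
        roundtrip=ok
    else
        roundtrip=mismatch
    fi
fi

echo "round trip: $roundtrip"
rm -rf "$workdir"
```

For the real files, the same decode step needs the full dump the delta was made against, so that full dump has to be kept alongside the deltas.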

Still, it would be great to have a tool which could produce diff -u -style output for large files.