libxmldiff - a simple library to diff XML files
Download - v0.2.6 (30/09/2008)
For out-of-the-box win32 binaries for xmldiff, the command line tool demonstrating libxmldiff, please go to the xmldiff page
Download the current libxmldiff source code.
Download the binary Debian package (v0.2.5). To install with apt-get, please go to the Debian distribution page, or use the Ubuntu package (v0.2.6) from the apt repository.
MinGW : binaries for Dev-CPP (v0.2.5) and development files for Dev-CPP. You may prefer the DevPak version : see the DevPak page.
MSVC : binaries for MSVC (v0.2.6) and development files for MSVC (v0.2.6).
Note : development packages may not always match the latest CVS version.
What ?
libxmldiff aims to provide an efficient diff for XML files.
Features are :
- Detection of modified items, added items, removed items
- Not sensitive to item position changes (based on an id).
- Designed to support large XML files (about 100 MB). For larger files, preprocessing them with the XML Pre Diff Tool would be a good idea.
- Basic XML manipulation (XSLT, Deletion,...)
- Scripting abilities
- Simple to use
How expensive is it ?
These programs are released under the GPL license. It allows you to use the program at no charge, and freely adapt the provided source code to your needs.
How does it work ?
libxmldiff diffs two files and outputs a file with exactly the same structure (unlike other XML diffing utilities), containing an extra diff:status attribute.
The meaning of this diff:status attribute is :
- added : the element has been added.
- removed : the element has been removed.
- modified : either an attribute or the text has been modified; the values are output with the '|' separator : "before|after".
- below : the element itself was not modified, but a child item was.
libxmldiff can use identifiers (attributes or nodes) to see exactly which element has been added or removed, despite the order of the elements.
Here is a sample of libxmldiff output :
<test diff:status="below" xmlns:diff="http://www.via.ecp.fr/~remi/soft/xml/xmldiff">
  <file diff:status="added" id="2"/>
  <att diff:status="modified" id="1" old="tata|toto" removed="|toto"/>
  <file att="tot|" diff:status="modified" id="12">
    <name diff:status="modified">toto.dat|toto.cfg</name>
    <!-- Test -->C'est toto !
  </file>
  <file diff:status="added" id="24"/>
  <tulipe diff:status="modified" id="42">Tulipe|Tulipe 2</tulipe>
  <toto diff:status="added">Titi !</toto>
  <section1 diff:status="below">
    <section2 diff:status="below">
      <section3>Test</section3>
      <section3 diff:status="removed">Test</section3>
    </section2>
  </section1>
</test>
This diff:status attribute is very easy to handle in an XSL stylesheet, so you can write the XSL that displays your XML document in the way that suits you best.
More information is available in the code documentation; a description of the algorithm is available here.
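The diff:status attribute is just as easy to read from a general-purpose language as from XSL. The following is a minimal sketch in Python using only the standard library; it does not use libxmldiff itself. The helper name list_changes is hypothetical, and SAMPLE is a trimmed copy of the sample output shown earlier. It walks the tree, skips elements that are unchanged or marked below, and splits modified text on the '|' separator.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the sample output; ElementTree exposes the
# diff:status attribute under this fully qualified name.
DIFF_NS = "http://www.via.ecp.fr/~remi/soft/xml/xmldiff"
STATUS = f"{{{DIFF_NS}}}status"

SAMPLE = """\
<test diff:status="below" xmlns:diff="http://www.via.ecp.fr/~remi/soft/xml/xmldiff">
  <file diff:status="added" id="2"/>
  <file att="tot|" diff:status="modified" id="12">
    <name diff:status="modified">toto.dat|toto.cfg</name>
  </file>
  <tulipe diff:status="modified" id="42">Tulipe|Tulipe 2</tulipe>
  <section1 diff:status="below">
    <section2 diff:status="below">
      <section3>Test</section3>
      <section3 diff:status="removed">Test</section3>
    </section2>
  </section1>
</test>"""


def list_changes(root):
    """Return (status, tag, detail) for each added, removed or modified element.

    detail is a (before, after) pair when the element text was modified,
    otherwise None. Hypothetical helper, for illustration only.
    """
    changes = []
    for elem in root.iter():
        status = elem.get(STATUS)
        if status in (None, "below"):
            continue  # unchanged, or only a descendant changed
        detail = None
        text = (elem.text or "").strip()
        if status == "modified" and "|" in text:
            before, after = text.split("|", 1)
            detail = (before, after)
        changes.append((status, elem.tag, detail))
    return changes


for change in list_changes(ET.fromstring(SAMPLE)):
    print(change)
```

Splitting on the first '|' is a simplification: it gives the wrong result if the original value itself contains that character, and this sketch ignores modified attributes such as att="tot|".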
How to use ?
To use the xmldiff command line tool, please see the dedicated xmldiff page.
Using libxmldiff should be straightforward with the comments in the headers. The generated documentation can also help. A tutorial and developer documentation will follow shortly.
Changelog
2006-03-02 22:37 [0.2.4] remi
* Major changes in non-regression test unit :
  - test support now other operation than simple diff
  - expected results are no more included
  - command.lst format was modified
* Fixed crash with wrong XSLT files
* Implemented xsltSaveToFilename (fix omit-declaration)
* New feature : merge action
* Implemented namespaces in delete action

2006-02-14 00:02 remi
* Support of parameters in xslt
* Increased the number of arguments to 25 ; now is a #define.
* Conversion console -> UTF8 for XSLT arguments
* Handling of variables in XSLT arguments
* Take care of <xsl:message terminate="yes" />

2006-01-06 19:47 remi
* Fixed bug reported by Jorge Robles - Tests provided

2005-08-06 16:42 [0.2.3] remi
* Boolean argument now are set to 'yes' if no second member was given
* Fixed parser bug on invalid arguments
* Diff strings (ns, xmlns, attr) as arguments
* If set to 'no', no diff namespace will be used

2005-07-30 19:14 remi
* Kludged a crash on Linux for similar documents
* Fixed CDATA bug
* Do not create the output file when no differences
* New ignore option
* Added non-regression tests

2005-05-28 20:13 remi
* Added minimalistic build
* Fixed namespace problem
* Added --merge-ns option
* Added --keep-diff-only option
* Fixed help message
* Fixed removed element handling when optimizing memory

2005-05-01 00:46 [0.2.2] remi
* Fixed Namespace bug on imported nodes (xmlReconciliateNs)
* Added cleanPrivateTag function
* Better Error Handling
* Fixed bug in namespace in elements

2005-03-10 21:54 remi
* Added debian packaging system
* Removed VS6 useless warnings
* Fixed a strcmp redefinition (if it still complains, rebuild all)
* Solved (partially) xmlFree segfault with DevCPP : static link of a mingw build of libxml2
* TODO: Added some todo's

2005-03-05 18:01 remi
* Linux Build System
* DevPak generation
* Added main Header of the libxmldiff library

2005-02-13 19:04 remi
* Initial New Tree (with DevCPP and VS support)
* bin/xmldiff.gui: Gui It command file
* BugFix : problem with force-clean on 1-char text

2004-08-08 01:42 [0.2.1] remi
* Refactored for use in xmlTreeNav
* Fix of forceClean in xslt : this option does not make sense on xslt files

2004-07-05 23:59 remi
* Added decent exception handling (first step, contents should be completed)
* BugFixes :
  - auto-save issue : flush is called at the end of a script execution
  - number of nodes problems
  - doNotFreeBeforeItems when optimiseMemory = false
* Reuse alias in xslt transforms

2004-06-26 20:39 remi
* Added xslt/exslt transformation
* Added scripts & script parameters

2004-06-06 01:28 remi
* Splitted Operations design (no backwards compatibility)
* Implemented xmldiff progress bar callback
* Implemented diffOnly & doNotFreeBeforeNodes options
* recalc now take check if modified items are still modified

2004-06-01 00:11 remi
* src/: libxml2_utils.h, xerces_utils.h, xmldiff_xerces.cpp: Files removed while refactoring the code.

2004-06-01 00:10 [0.2.0] remi
* Code refactoring ; it is now split into :
  - xmldiff : contains program specific items (command line parsing, options,...)
  - lx2_diff : diff algorithm implemented for libxml2
  - lx2_utils : libxml2 usefull functions (and string handling)
  It actually does work on win32 with the same functionnality as before (Xerces). Run under Windows and Linux.
* Namespace in attributes are now handled properly.

2004-05-23 22:50 [0.1.0] remi
* First Import in CVS. Some files are taken from xmldiff previous module

