Removing very large corrupt records using the Verify utility

In version 232, the Verify utility would crash when trying to clean up a file that contained a 'record' longer than 512,000 bytes.
(Keep in mind that the maximum length of a MARC21 record is 99,999 bytes, since the record length field in the leader is only five digits).

There was a problem in the utility (and in MARC Report overall) that caused this behavior, and it has been fixed. The program should now be able to clean up a file containing very large (invalid) records with no problems.

The disk cache used by the utility is 512,000 bytes, and one reason the problem arose is that a single 'record' spanned several of these disk blocks without ever being resolved (it's quite normal for a record to span disk blocks).

This type of problem (records with an invalid record length) occurs several times in almost every database we receive. Sometimes the record length is written as six bytes to accommodate the size of the record, and sometimes it simply starts counting over again at 0 when byte 99999 is reached (like a car's odometer). The problem seems to arise when attempting to export every holding for a (serial) record that is held many (many) times by every member in the system 1).
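A minimal sketch of how such records can still be recovered: instead of trusting the five-digit record length in the leader, split the file on the MARC21 record terminator byte (0x1D) and flag any record whose declared length disagrees with its actual length (for example, an odometer-style wrap back to 0). This is a hypothetical helper for illustration, not MARC Report's actual code.

```python
# Hypothetical helper: recover record boundaries from raw MARC data when
# the leader's record-length field cannot be trusted.

RECORD_TERMINATOR = b"\x1d"  # MARC21 end-of-record byte

def split_records(data: bytes):
    """Split raw MARC data on the record terminator rather than on the
    declared length; return (record, length_is_valid) pairs."""
    records = []
    start = 0
    while start < len(data):
        end = data.find(RECORD_TERMINATOR, start)
        if end == -1:
            break  # trailing bytes with no terminator
        record = data[start:end + 1]
        declared = record[:5]
        # Flag records whose declared length disagrees with reality,
        # e.g. a >99,999-byte record whose length counter wrapped.
        valid = declared.isdigit() and int(declared) == len(record)
        records.append((record, valid))
        start = end + 1
    return records

# A fake 30-byte record whose leader correctly declares 30 bytes:
good = b"00030" + b"x" * 24 + b"\x1d"
# A record whose declared length is wrong (wrapped or garbled):
bad = b"00005" + b"y" * 24 + b"\x1d"
for rec, ok in split_records(good + bad):
    print(len(rec), ok)
```

Splitting on the terminator is robust to a bad length field, but note it would misfire if a record's data happened to contain a stray 0x1D byte, which is why a real cleanup utility also has to validate the directory and field terminators.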

If only we had MarcXml, there would be no limits!

If you do run into a problem that might be related to this, please contact us for assistance.

1)
One vendor's product gets around this issue by breaking the record into smaller, valid records during export. But this creates another problem for batch processing: the unique system ID is no longer unique.
233/verify_invalid_reclen.txt · Last modified: 2021/12/29 16:21 (external edit)