Using the MARC Review 'String List' search to find errors reported by MARC Analysis
Here is a practical application of MARC Review's string list search feature.
MARC Analysis (included with MARC Report) is a useful tool for discovering information about the MARC databases we work with. It can also highlight problem areas, and even very specific errors, in our records.
For example, the following excerpt from a MARC Analysis run on a newspaper index database of 300,000 records revealed these strings and occurrence counts in the 008 Date 1 element:
29 O: 1
3 : 1
33 : 1
4 : 28
4-JA: 1
5 : 71
5 Ma: 1
6 : 21
7 : 120
7/JA: 1
8 : 83
8997: 1
9 : 1
91 : 1
94 : 17
95 : 11
9550: 1
96 : 6
97 : 4
98 : 3
994 : 13
995 : 13
996 : 4
997 : 3
AR 1: 1
E 19: 5
ER 1: 2
June: 1
Sept: 1
Spri: 2
T 19: 2
UG 1: 3
UL 1: 2
V 19: 1
Y 19: 3
er 1: 7
erfo: 1
ly -: 1
r 19: 2
t 19: 1
th c: 1
ug 1: 1
une : 7
y 19: 5
y/Ma: 1
Although it was enlightening for the customer to find out about these problems, they (of course) then wanted to get a file containing only these records so they could quickly fix them.
Here are the steps we followed to match and extract only the records with the bad 008/Date1 fields:
- Cut out the rows containing the 008 Date1 errors from the MARC Analysis report and paste them into a new text file
- Open that file in a text editor and remove the leading spaces and everything after the colon; save the file
- Start MARC Review, press Next, enter '008', press TAB, click on 'Date 1', then press Save
- Back on the pattern form, right-click on the Data box, select the 'Simple string list' option, then select the text file created above
- Press Next, click 'MARC Output', select 'Matching records only', press Next, then press Run.
That's it!
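If the list of bad strings is long, the text-editor cleanup in the second step can also be scripted. The following is only a sketch, assuming each pasted row has the shape "string: count" with some leading indentation, as in the excerpt above. The function names and the file names in the usage note are hypothetical.

```python
def clean_report_rows(rows):
    """Turn MARC Analysis rows such as '   4-JA: 1' into bare search strings."""
    strings = []
    for row in rows:
        row = row.rstrip("\r\n")
        if ":" not in row:
            continue  # skip blank lines and anything that is not a 'string: count' row
        # Keep everything before the LAST colon, so the count is dropped.
        value, _, _count = row.rpartition(":")
        # Remove only the leading spaces; spaces inside or at the end of the
        # string are kept, since they are part of the four-byte value.
        strings.append(value.lstrip(" "))
    return strings


def clean_file(src_path, dst_path):
    """Read pasted report rows from src_path, write one search string per line."""
    with open(src_path, encoding="utf-8") as src:
        strings = clean_report_rows(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for s in strings:
            dst.write(s + "\n")
```

For example, calling clean_file("date1_rows.txt", "date1_list.txt") would produce a file that could then be selected in the 'Simple string list' step.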
Note: because the data being searched in this example is fixed-length (each Date 1 value is four bytes), no regular expression is needed in the pattern.
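To illustrate why a plain comparison is enough here, the sketch below pulls the four bytes at 008 positions 07-10 (Date 1) out of each raw MARC record and checks them against a list of strings. It is not how MARC Review works internally. It is a minimal illustration that assumes well-formed ISO 2709 records and assumes the record terminator byte never occurs inside field data. All function names are hypothetical.

```python
FIELD_END = b"\x1e"   # ISO 2709 field terminator
RECORD_END = b"\x1d"  # ISO 2709 record terminator


def split_records(data):
    """Yield each raw record (terminator included) from the bytes of a MARC file."""
    for chunk in data.split(RECORD_END):
        if chunk.strip():
            yield chunk + RECORD_END


def get_008(record):
    """Return the 008 field of one raw record as text, or None if it has none."""
    base = int(record[12:17])        # leader 12-16: base address of data
    directory = record[24:base - 1]  # the directory ends with a field terminator
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]  # tag (3) + field length (4) + start (5)
        if entry[:3] == b"008":
            length = int(entry[3:7])
            start = int(entry[7:12])
            field = record[base + start:base + start + length]
            return field.rstrip(FIELD_END).decode("ascii", errors="replace")
    return None


def date1(record):
    """Return the four-character Date 1 element (008/07-10), or None."""
    field = get_008(record)
    if field is None or len(field) < 11:
        return None
    return field[7:11]


def matching_records(data, wanted_strings):
    """Return the raw records whose Date 1 equals any string in the list."""
    wanted = set(wanted_strings)
    return [rec for rec in split_records(data) if date1(rec) in wanted]
```

Because every candidate is exactly four bytes at a known offset, a set lookup does the whole job and no pattern matching is involved.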
For more details on list searching visit this page