Digital Archive Image Quality

One of the most common questions we are asked concerns the quality of the page images on UKPressOnline:

"Why are some of the pages so much crisper (and cleaner) than others?"

It's all to do with the way we take the originals into the database.

There are two ways to get a newspaper or magazine page into the archive: as a Print-PDF or as a scanned image.

Print-PDF pages have been available since the year 2000 (but not for every publication). These are pages that the publisher produced on a computer system, generating a PDF file to send to the printing press. A great many of the newspaper archive's pages are in this Print-PDF format. Our updates of the Daily Mirror, Daily Express, Daily Star, Daily Star Sunday, Sunday Express, Church Times and Morning Star come from the actual computer files from which that day's newspapers are printed. The main advantage of these files is that they give high-quality reproduction, both on screen and in reprints. Text is rendered with very clean edges and with 100% accuracy.

However, most of the pages pre-date the age of computer imagesetting. The 'pre-computer' pages in UKPressOnline, from 1835 to 2001, are scanned from the original paper editions or from microfilm. Some of these pages have been badly damaged or have suffered serious degradation from damp, long-term exposure to sunlight and any number of other time-related problems. The newspapers from some days are missing because nobody has a copy of them. We put a lot of effort into finding these missing editions and adding them to UKPressOnline.

We scan the pages at as high a resolution as we can: sometimes 400dpi (dots per inch), more often 330dpi. The quality of these images depends on the quality of the original. We then use OCR (Optical Character Recognition) on the page image to make the text available and, again, the success of this process depends on the quality of the original. We will get much more accurate text from a clean page with large type than we will from an old, faded page with very small type. Where the OCR results are not particularly good, we are happy to reprocess the page to improve them (subscribers can report problem pages to us via the 'Report Page' buttons).
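To give a rough feel for what those scanning resolutions mean for small type, here is a minimal Python sketch. It is only an illustration, not part of our scanning workflow: the function names and the 6-point type size are assumptions chosen for the example. It relies on two standard conversions, 25.4 mm to the inch and 72 points to the inch.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # desktop-publishing point


def scan_size_px(length_mm: float, dpi: int) -> int:
    """Number of pixels a scanner produces along a length at a given dpi."""
    return round(length_mm / MM_PER_INCH * dpi)


def type_height_px(point_size: float, dpi: int) -> float:
    """Nominal height in pixels of type of a given point size at a given dpi."""
    return point_size * dpi / POINTS_PER_INCH


# Hypothetical example: very small 6-point type at the two resolutions.
for dpi in (330, 400):
    print(dpi, "dpi:", type_height_px(6, dpi), "pixels per line of 6pt type")
```

The fewer pixels there are for each character, the less there is for OCR to work with, which is one reason very small type on a faded page gives poorer results.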

To print any of the pages, we recommend that you print from the PDF file. You may wish to use the 'fit to page' setting to fit the whole page on one sheet, but this can result in a page of very small type. If you do not have a large-format printer (A3 or larger), you may wish either to print a tiled page, which gives text at full size, or to take the file to a local quick-print bureau.
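The trade-off between 'fit to page' and tiling can be sketched with a little arithmetic. The Python below is a rough illustration only: the 280 x 430 mm page size is a hypothetical tabloid-like example, not a measurement of any title in the archive, and the calculation ignores printer margins and any overlap between tiles. The A4 and A3 dimensions are the standard ISO 216 sizes.

```python
import math

A4_MM = (210.0, 297.0)  # ISO 216 A4, portrait (width, height)
A3_MM = (297.0, 420.0)  # ISO 216 A3, portrait (width, height)


def fit_scale(page_mm, sheet_mm):
    """Scale factor that fits the whole page on one sheet (never enlarges)."""
    page_w, page_h = page_mm
    sheet_w, sheet_h = sheet_mm
    return min(sheet_w / page_w, sheet_h / page_h, 1.0)


def tiles_needed(page_mm, sheet_mm):
    """Sheets needed to print the page at full size, as (columns, rows)."""
    page_w, page_h = page_mm
    sheet_w, sheet_h = sheet_mm
    return math.ceil(page_w / sheet_w), math.ceil(page_h / sheet_h)


# Hypothetical page size, assumed for illustration only.
page = (280.0, 430.0)

scale = fit_scale(page, A4_MM)
print(f"Fit to A4: type is printed at about {scale:.0%} of its original size")
print("Full-size tiles on A4 (columns, rows):", tiles_needed(page, A4_MM))  # (2, 2)
```

Under these assumptions, fitting the page to a single A4 sheet shrinks the type to roughly two thirds of its original size, while printing at full size takes four A4 sheets.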