From “Unplayable” to Searchable Online: the House Recordings Recovery Project

By Steven Kantner, Digital Asset Coordinator

One of the many reel-to-reel recordings marked “unplayable” by the Texas House of Representatives media staff.

In 2007, the Texas House of Representatives’ Media Services transferred about 350 reels of audiotape to the Texas State Library and Archives Commission (TSLAC). Most of the recordings dated between 1975 and 1984 and covered the House floor debates from the entire 63rd through 68th Legislative sessions. Many House committee recordings were included as well. At the time of the transfer, House media staff described the majority of the reels as “unplayable.” Marked as damaged and unplayable, the audiotapes were stored in TSLAC’s climate-controlled stacks awaiting deaccessioning.

State Archives staff revisited this collection in 2017 after digitizing recordings from the House Textbook Committee and others from the late 1950s and early 1960s. Digital Asset Coordinator Steven Kantner, with a background in recording engineering along with a graduate school focus on the preservation of audiovisual materials, recognized the primary issue facing these tapes.

Samples from the House recordings. The few Scotch 207 and Ampex 631 tapes in the set did not require any treatment for playback. However, over 300 Ampex 407 tapes did.

The bulk of the audiotape used in the House recordings from this time period was Ampex 407. Ampex was once a well-known manufacturer of recording devices and produced their own brand of audiotape.

Residues on the surface of the tape’s black back coating. The back coating is the primary suspect in the increased occurrence of stickiness in tapes manufactured after 1970.
An Ampex tape exhibiting binder degradation. The tape is not falling off the pack tangentially as it would when new.

As years pass, audiotape is known to suffer from binder degradation, also known as “sticky-shed” or “sticky-binder” syndrome. Post-1970 audiotape construction has multiple layers that keep magnetic and carbon particles attached to the support tape. Over time, these chemical bonds break down from exposure to humidity. Ampex 407 is no exception.

Tapes with this condition will squeal upon playback and can lock up the tape player altogether. This can damage the tape and the players too. While there have been various methods applied to attempt remediation of this degradation, the most successful and widely used is a heat treatment. A pilot test consisting of a random sample of the tapes was conducted to prove salvaging these recordings was possible.

Soon after the first project meeting in April 2018, the effort was underway. Using a scientific lab oven in the State Archives, a dozen reels of tape at a time were carefully heated at 130°F (54°C) for a total of 24 hours. The tapes were then cooled for at least 24 hours before they were played.

Preparing to bake reel-to-reel tapes in the State Archives oven.

The original Studer ReVox and Sony recorders used to create the tapes were not available. TSLAC bought a brand-new Otari MX-5050 reel-to-reel player in 2014, about one year before Otari ended manufacture of these last modern reel-to-reel players. The original recorders had a tape speed option to slow the tape down to audio-cassette speed (1.875” per second). The Otari does not have that option and only uses faster consumer and production tape speeds.

Capturing a House recording with equipment in the State Archives Digital Lab.

Since no new reel players are on the market today, and working old ones are hard to come by, the recordings were captured at double their original speed, but at a very high digital resolution. The high resolution compensated for the time duration adjustment made after digitization, when the audio was slowed back to its original speed. This provided quality better than compact discs and kept audio transfers within digitization guidelines and standards from organizations such as the International Association of Sound and Audiovisual Archives.
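
The arithmetic behind that time duration adjustment can be sketched in a few lines of Python. This is only an illustration: the 96 kHz capture rate and the 45-minute capture length are hypothetical figures (the article does not state them), and the sketch assumes the correction is made by reinterpreting the sample rate, which is one common approach and not necessarily the one used in this project.

```python
from fractions import Fraction

def speed_correction(recorded_ips, playback_ips, capture_rate_hz, capture_seconds):
    """Return (speed factor, corrected sample rate, original running time).

    recorded_ips / playback_ips are tape speeds in inches per second.
    capture_rate_hz is an ASSUMED digitization sample rate.
    """
    # How many times faster than real time the tape was played.
    factor = Fraction(str(playback_ips)) / Fraction(str(recorded_ips))
    # Relabeling the samples at this lower rate restores the original speed.
    corrected_rate = Fraction(capture_rate_hz) / factor
    # The audio then lasts 'factor' times longer than the capture did.
    real_seconds = capture_seconds * factor
    return factor, corrected_rate, real_seconds

# Hypothetical example: a 45-minute capture at an assumed 96 kHz.
factor, rate, seconds = speed_correction(1.875, 3.75, 96_000, 45 * 60)
print(f"speed factor: {float(factor)}x")
print(f"corrected sample rate: {float(rate):.0f} Hz")
print(f"original running time: {float(seconds) / 60:.0f} minutes")
```

Under these assumed numbers, the slowed-down audio would still be sampled at 48 kHz, which is above the 44.1 kHz used by compact discs.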

While the bulk of the tapes just required heat treatment, some tapes exhibited other damage that occurred during the original recording or subsequent handling.

A tape that was stretched and curled upon itself. The poor tape pack seen here was commonly found on the reels. Some of the tapes continued to exhibit problems with tape pack even after rewinding and playback on the modern reel-to-reel player.

Some tapes had strange white residues that formed around old fingerprints left on the tape. Examination under a microscope determined that the residue was not mold and that the tapes were safe to handle.

Nearly all tapes were missing leader tape at the head or tail of the reels.

Splicing a tape and adding a new leader at the head of the reel.

Log books of the recordings were part of the original accession and contain useful metadata about the activities captured in the recordings. These were handwritten notes that included the “counter” information on the original recorder, which unfortunately is only helpful with the original playback equipment and doesn’t equate to an accurate “time stamp.” However, the names of representatives speaking and the bill numbers are useful for narrowing down what was happening on any given day. These log books were digitized and are provided as a PDF file that researchers can browse for names, bill numbers, and any other information they may need. Each page of the PDF is bookmarked with the Tape and Side where the audio resides and can be cross-referenced with the recordings.

The original project plan was to provide these to the public as MP3 files along with the PDF log books as an index. However, after some testing, it was found that using artificial intelligence for Automatic Speech Recognition (ASR) could be a powerful discovery tool for this collection. With over 1,000 hours of recordings, sending them to a vendor for transcription could cost the State thousands of dollars. Hiring people to write transcriptions manually would cost even more. Instead, an open source video software tool called ffmpeg was used to convert each MP3 audio file into an MP4 video file using a placeholder “frame” for the video image. Then the MP4 was uploaded into a private channel on YouTube. Many of the recordings were just under the time limit set by YouTube, and YouTube (owned by Google and likely using a light version of Google’s ASR) would provide captions within about 24 hours after upload.
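
For readers curious what that conversion step might look like, here is a minimal Python sketch that builds an ffmpeg command pairing a single still image with an MP3. The file names are hypothetical, and the encoder choices (libx264 video, AAC audio) are assumptions for illustration; the article does not record the exact command that was used.

```python
import subprocess

def still_image_video_command(audio_path, image_path, output_path):
    """Build an ffmpeg command that turns one image plus an audio file into an MP4."""
    return [
        "ffmpeg",
        "-loop", "1",            # repeat the single image as the video stream
        "-i", str(image_path),   # input 0: placeholder frame
        "-i", str(audio_path),   # input 1: the MP3 audio
        "-c:v", "libx264",       # assumed video encoder
        "-tune", "stillimage",   # x264 tuning suited to a static picture
        "-pix_fmt", "yuv420p",   # widely compatible pixel format
        "-c:a", "aac",           # assumed audio encoder for the MP4 container
        "-shortest",             # stop when the audio ends
        str(output_path),
    ]

def convert(audio_path, image_path, output_path):
    """Run the conversion; requires ffmpeg to be installed and on PATH."""
    subprocess.run(still_image_video_command(audio_path, image_path, output_path), check=True)

# Hypothetical file names, for illustration only.
cmd = still_image_video_command("tape_001_side_a.mp3", "placeholder.png", "tape_001_side_a.mp4")
print(" ".join(cmd))
```

The command is only printed here; calling `convert(...)` would actually run ffmpeg.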

A screenshot of a House recording playing with the captions along the bottom of the screen.

The captions are not perfect: heavy accents, people speaking simultaneously, and other background chatter on the tapes confuse the AI. Even so, a large majority of the captioning is accurate. The caption files were downloaded and placed with the recordings. When topics or House bill numbers are mentioned, this text is now searchable across the entire Texas Digital Archive. A text search will lead you to the captions; once the caption file is open, use the FIND feature in your browser to search through the text in the record. A time stamp is included with each line of captioning to help the user pinpoint the audio in the recording. Using ffmpeg, captions were also permanently burned into the video frames, so whole recordings are available not only as MP3 audio files but also as video files with the captions.
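
Burning captions into video frames is typically done with ffmpeg's `subtitles` video filter. The sketch below is a guess at the general shape of such a command, not the project's actual one: it assumes the captions are in SRT format, that ffmpeg was built with subtitle rendering support (libass), and that the hypothetical file names contain no characters needing filter escaping.

```python
def burn_captions_command(video_path, captions_path, output_path):
    """Build an ffmpeg command that renders a caption file onto the video frames."""
    return [
        "ffmpeg",
        "-i", str(video_path),
        # The subtitles filter draws each caption onto the picture itself,
        # so the text remains visible in any player.
        "-vf", f"subtitles={captions_path}",
        "-c:a", "copy",          # leave the audio track untouched
        str(output_path),
    ]

# Hypothetical file names, for illustration only.
burn_cmd = burn_captions_command(
    "tape_001_side_a.mp4", "tape_001_side_a.srt", "tape_001_side_a_captioned.mp4"
)
print(" ".join(burn_cmd))
```

Because the video is re-encoded with the text drawn in, the captions cannot be switched off afterward, which matches the "permanently burned" description above.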

The last audiotapes were captured about 15 months after the project kick-off, and within a couple of months all metadata and files were ready for ingest into the Texas Digital Archive. The collection, much of which was inaccessible for many years due to the tape condition, was now available to the public online.

Researchers using this collection have two options: use the log books to locate topics on a given day, or try a text search across a session or the entire collection. If using a text search, it is recommended to try several variations of how a House bill or other topic could be mentioned. For example, “house bill 131”, “HB 131”, or just “131”. As technology advances further, future discovery improvements may be implemented to make searching and discovery within this large set of recordings even better.
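
Anyone who downloads a caption file can also try those search variations in a short script. The Python sketch below assumes the captions are in SRT format (the article does not say which format the files use), and the sample caption text is invented for illustration, not taken from the collection.

```python
import re

def bill_query_variants(number):
    """Return the ways a House bill might be spoken or captioned."""
    return [f"house bill {number}", f"HB {number}", str(number)]

def search_captions(srt_text, queries):
    """Return (start_time, caption_text) for each cue containing any query."""
    hits = []
    # SRT cues are separated by blank lines: index, time range, then text.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that is not a well-formed cue
        start = lines[1].split("-->")[0].strip()
        text = " ".join(lines[2:])
        # Case-insensitive substring match against every variant.
        if any(q.lower() in text.lower() for q in queries):
            hits.append((start, text))
    return hits

# Invented sample captions, for illustration only.
sample = """1
00:00:01,000 --> 00:00:04,000
the chair lays out house bill 131

2
00:00:04,000 --> 00:00:07,000
members we are on second reading

3
00:12:30,500 --> 00:12:33,000
amendment to hb 131 is adopted
"""

hits = search_captions(sample, bill_query_variants(131))
for start, text in hits:
    print(start, text)
```

Note that the bare number is a plain substring match, so it would also match longer numbers such as "1131"; the more specific variants are worth trying first.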

Check out the collection here: Texas House of Representatives Recordings
