By Steven Kantner, Digital Asset Coordinator
In 2007, the Texas House of Representatives’ Media Services transferred about 350 reels of audiotape to the Texas State Library and Archives Commission (TSLAC). Most of the recordings dated from between 1975 and 1984 and covered the House floor debates from the entire 63rd through 68th Legislative sessions. Many House committee recordings were included as well. At the time of the transfer, House media staff described the majority of the reels as “unplayable.” Having been marked as damaged and unplayable, the audiotapes were stored in TSLAC’s climate-controlled stacks awaiting deaccessioning.
State Archives staff revisited this collection in 2017 after digitizing recordings from the House Textbook Committee and others from the late 1950s and early 1960s. Digital Asset Coordinator Steven Kantner, with a background in recording engineering along with a graduate school focus on the preservation of audiovisual materials, recognized the primary issue facing these tapes.
The bulk of the audiotape used in the House recordings from this time period was Ampex 407. Ampex was once a well-known manufacturer of recording devices and produced their own brand of audiotape.
As years pass, audiotape is known to suffer from binder degradation, also known as “sticky-shed” or “sticky-binder” syndrome. Post-1970 audiotape is constructed in multiple layers that keep magnetic and carbon particles attached to the support tape. Over time, these chemical bonds break down from exposure to humidity. Ampex 407 is no exception.
Tapes with this condition will squeal upon playback and can lock up the tape player altogether, which can damage both the tape and the player. While various methods have been applied to remediate this degradation, the most successful and widely used is a heat treatment. A pilot test on a random sample of the tapes was conducted to prove that salvaging these recordings was possible.
Soon after the first project meeting in April 2018, the effort was underway. Using a scientific lab oven in the State Archives, a dozen reels of tape at a time were carefully heated at 130°F (54°C) for a total of 24 hours. The tapes were then cooled for at least 24 hours before they were played.
The original Studer ReVox and Sony recorders used to create the tapes were not available. TSLAC bought a brand-new Otari MX-5050 reel-to-reel player in 2014, about one year before Otari ended manufacture of these last modern reel-to-reel players. The original recorders had a tape speed option to slow the tape down to audio-cassette speed (1.875 inches per second). The Otari does not have that option and only uses faster consumer and production tape speeds.
Since no new reel players are on the market today, and working old ones are hard to come by, the recordings were captured at double their original speed, but at a very high digital resolution. The high resolution compensated for the time-duration adjustment made after digitization, when the audio was slowed back to its original speed. The result was quality better than that of compact discs, and it kept the audio transfers within digitization guidelines and standards from organizations such as the International Association of Sound and Audiovisual Archives.
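The arithmetic behind that adjustment can be sketched briefly. This is a minimal illustration rather than the workflow actually used: the 96 kHz capture rate is an assumption (the project is described only as using a "very high digital resolution"), the 3.75 inches-per-second playback speed is simply double the original 1.875, and the function names are hypothetical.

```python
def speed_factor(playback_ips: float, original_ips: float) -> float:
    """How many times faster the tape ran during capture than during recording."""
    return playback_ips / original_ips


def corrected_sample_rate(capture_rate_hz: float, factor: float) -> float:
    """Effective sample rate once the audio is slowed back to real time.

    Slowing playback by `factor` spreads the same samples over a longer
    duration, so the effective rate is the capture rate divided by `factor`.
    """
    return capture_rate_hz / factor


def corrected_duration(captured_seconds: float, factor: float) -> float:
    """Real-time duration of the event that was recorded."""
    return captured_seconds * factor


ORIGINAL_IPS = 1.875       # cassette speed of the original recorders
PLAYBACK_IPS = 3.75        # double the original speed
CAPTURE_RATE_HZ = 96_000   # ASSUMPTION: illustrative value, not from the project

factor = speed_factor(PLAYBACK_IPS, ORIGINAL_IPS)
effective_rate = corrected_sample_rate(CAPTURE_RATE_HZ, factor)

# A 30-minute (1800 s) capture corresponds to 60 minutes of real time.
print(factor, effective_rate, corrected_duration(1800, factor))
# prints: 2.0 48000.0 3600.0
```

Under these assumed numbers, the speed-corrected audio still has an effective rate of 48 kHz, which is above the 44.1 kHz used on compact discs.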
While the bulk of the tapes just required heat treatment, some tapes exhibited other damage that occurred during the original recording or subsequent handling.
Some tapes had strange white residues that formed around old fingerprints. Examination under a microscope determined that the residue was not mold and that the tapes were safe to handle.
Nearly all tapes were missing leader tape at the head or tail of the reels.
Log books of the recordings were part of the original accession and contain useful metadata about the activities captured in the recordings. These handwritten notes include the “counter” information from the original recorder, which unfortunately is helpful only with the original playback equipment and doesn’t equate to an accurate “time stamp.” However, the names of the representatives speaking and the bill numbers are useful for narrowing down what was happening on any given day. The log books were digitized and are provided as a PDF file that researchers can browse for names, bill numbers, and any other information they may need. Each page of the PDF is bookmarked with the Tape and Side where the audio resides, so it can be cross-referenced with the recordings.
The original project plan was to provide these to the public as MP3 files along with the PDF log books as an index. However, after some testing, it was found that using artificial intelligence for Automatic Speech Recognition (ASR) could be a powerful discovery tool for this collection. With over 1,000 hours of recordings, sending the audio to a vendor for transcription could cost the State thousands of dollars. Hiring people to write transcriptions manually would cost even more. Instead, an open source video software tool called ffmpeg was used to convert each MP3 audio file into an MP4 video file using a placeholder “frame” for the video image. The MP4 was then uploaded to a private channel on YouTube. Many of the recordings were just under the time limit set by YouTube, and YouTube (owned by Google and likely using a light version of Google’s ASR) would provide captions within about 24 hours after upload.
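For readers curious about what such a conversion looks like, here is a minimal sketch in Python that assembles one plausible ffmpeg command. The exact options used in the project are not recorded here, so the encoder choices below are assumptions, and the file names and function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_still_video_command(audio: Path, frame: Path, output: Path) -> list[str]:
    """Return an ffmpeg argument list that pairs one still image with an audio track."""
    return [
        "ffmpeg",
        "-loop", "1",            # repeat the single image for the whole video
        "-i", str(frame),        # input 0: the placeholder frame
        "-i", str(audio),        # input 1: the MP3 audio
        "-c:v", "libx264",       # assumed video encoder (H.264)
        "-tune", "stillimage",   # libx264 tuning for static pictures
        "-pix_fmt", "yuv420p",   # widely compatible; needs even image dimensions
        "-c:a", "aac",           # assumed: re-encode the audio as AAC for MP4
        "-shortest",             # stop when the audio ends, not the looped image
        str(output),
    ]


def convert(audio: Path, frame: Path, output: Path) -> None:
    """Run the conversion; requires ffmpeg to be installed and on the PATH."""
    subprocess.run(build_still_video_command(audio, frame, output), check=True)


# Hypothetical file names, for illustration only.
cmd = build_still_video_command(
    Path("tape_001_side_a.mp3"),
    Path("placeholder.png"),
    Path("tape_001_side_a.mp4"),
)
print(" ".join(cmd))
```

Building the argument list separately from running it makes the command easy to inspect before committing to a batch of many hours of audio.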
The captions are not perfect, as heavy accents, people speaking simultaneously, and other background chatter on the tapes confuse the AI, but a large majority of the captioning is accurate. The caption files were downloaded and placed with the recordings. When topics or House bill numbers are mentioned, this text is now searchable across the entire Texas Digital Archive. A text search will lead you to the captions; once the caption file is open, use the FIND feature in your browser to search through the text in the record. A time stamp is included with each line of captioning to help the user pinpoint the audio in the recording. Using ffmpeg, captions were also permanently burned into the video frames, so whole recordings are available not only as MP3 audio files but also as video files with the captions.
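The way a time-stamped caption file supports this kind of lookup can be sketched in a few lines of Python. The example assumes the captions are in the common SRT format, which may not match the files in the archive, and the sample caption text is invented for illustration. The names `Cue`, `parse_srt`, and `find_mentions` are hypothetical.

```python
import re
from typing import Iterator, NamedTuple


class Cue(NamedTuple):
    start: str  # e.g. "00:12:03,500"
    end: str
    text: str


# SRT timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text: str) -> Iterator[Cue]:
    """Yield one Cue per caption block in SRT-formatted text."""
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                # Everything after the timing line is the caption text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                yield Cue(match.group(1), match.group(2), text)
                break


def find_mentions(srt_text: str, phrase: str) -> list[Cue]:
    """Return the cues whose text contains `phrase`, ignoring case."""
    needle = phrase.lower()
    return [cue for cue in parse_srt(srt_text) if needle in cue.text.lower()]


# Invented sample captions, not taken from the collection.
SAMPLE = """1
00:00:01,000 --> 00:00:04,000
the chair lays out house bill 131

2
00:00:04,500 --> 00:00:07,000
is there objection
"""

for cue in find_mentions(SAMPLE, "House Bill 131"):
    print(cue.start, cue.text)
# prints: 00:00:01,000 the chair lays out house bill 131
```

Each match carries its start time, which is what lets a listener jump to the right spot in a recording that may run for hours.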
The last audiotapes were captured about 15 months after the project kick-off, and within a couple of months all metadata and files were ready for ingest into the Texas Digital Archive. The collection, much of which was inaccessible for many years due to the tape condition, was now available to the public online.
Researchers using this collection have two options: use the log books to locate topics on a given day, or try a text search across a session or the entire collection. When using a text search, try several variations of how a House bill or other topic could be mentioned, for example, “house bill 131”, “HB 131”, or just “131”. As technology advances, future improvements may be implemented to make searching and discovery within this large set of recordings even better.
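For anyone working with downloaded caption text, the same advice can be expressed as a small Python helper. This is only a sketch: it covers just the three variations named above, the function names are hypothetical, and the sample sentences are invented.

```python
import re


def house_bill_variants(number: int) -> list[str]:
    """The three ways of writing a House bill suggested above."""
    return [f"house bill {number}", f"HB {number}", str(number)]


def mentions_bill(text: str, number: int) -> bool:
    """True if any variant appears in `text` as a whole phrase, ignoring case.

    Word boundaries keep a search for 131 from matching 1310 or 2131.
    The bare number is the broadest variant; the longer forms are listed
    so that the pattern mirrors what a researcher would type.
    """
    pattern = "|".join(rf"\b{re.escape(v)}\b" for v in house_bill_variants(number))
    return re.search(pattern, text, re.IGNORECASE) is not None


print(house_bill_variants(131))
# prints: ['house bill 131', 'HB 131', '131']
print(mentions_bill("members, hb 131 is on second reading", 131))
# prints: True
print(mentions_bill("the clerk will read house bill 1310", 131))
# prints: False
```

Because automatic captions may render numbers inconsistently, a search on the bare number will usually cast the widest net, at the cost of more unrelated hits.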
Check out the collection here: Texas House of Representative Recordings