If you un-pbo the sounds.pbo, you can find all the in-game music, sound fx, and voice files. The music and sound fx files have logical names and are easy to browse to find the one you want (e.g., birdsing_01, crickets_01).
Unfortunately, the voice file names give no clue to their contents. Examples:
UNIV_r30 <-- "Cover us!" (little r denotes a radio voice)
UNIV_v01 <-- "Get out of here!" (little v denotes a non-radio voice)
Under the Missions directory within the unpacked Sounds PBO directory, there are nine folders, each containing the same sound files recorded by a different voice actor (BRIAN, DAN, DUSAN, HOWARD, JEFF, MATTHEW, RPOERTPOLO, RUSSEL, RYAN). This is cool, because it means the line "Cover us!" has the same file name, "UNIV_r30", under each voice actor. That will reduce the effort needed to catalog these canned voice files, since each line only has to be transcribed once.
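Since a file name means the same line under every actor folder, one catalog keyed by file name should be enough. Here is a rough Python sketch of that idea. The function name, the root path, and the .ogg extension are my own assumptions for illustration, so adjust them to match your unpacked PBO:

```python
from pathlib import PurePosixPath

# Voice actor folder names as listed above (spelling kept exactly as found).
VOICE_ACTORS = ["BRIAN", "DAN", "DUSAN", "HOWARD", "JEFF",
                "MATTHEW", "RPOERTPOLO", "RUSSEL", "RYAN"]

# One transcript per file name covers every actor.
CATALOG = {
    "UNIV_r30": "Cover us!",
    "UNIV_v01": "Get out of here!",
}

def actor_paths(name, root="Sounds/Missions", ext=".ogg"):
    """Return {actor: path} for one cataloged file name.

    root and ext are assumptions, not confirmed paths.
    """
    return {actor: str(PurePosixPath(root) / actor / (name + ext))
            for actor in VOICE_ACTORS}
```

Calling actor_paths("UNIV_r30") gives one path per voice actor, all pointing at the same line of dialogue.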
There are also mission-specific dialog voice files that would all have to be cataloged separately. Examples:
M01r01.ogg <-- "Echo this is crossroad. The convoy is approaching. Eliminate as many infantry as you can and disappear." M01 is for Mission 1, little r is radio, 01 is the first radio voice file in the first mission.
TRr02.ogg <-- "Return to your task, or you will be charged!" TR is the Training mission, little r is radio, 02 is the second radio voice file used in the mission.
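The naming pattern in these examples could be split apart automatically. This is only a sketch in Python: the regular expression is inferred from the four file names in this post, and parse_voice_name is a made-up helper, so other files in the PBO may not match it:

```python
import re

# Prefix (UNIV, M01, TR), optional underscore, r (radio) or v (non-radio),
# then the sequence number. Pattern is a guess based on the examples only.
NAME_RE = re.compile(
    r"^(?P<prefix>[A-Z][A-Z0-9]*?)_?(?P<kind>[rv])(?P<num>\d+)(?:\.ogg)?$"
)

def parse_voice_name(filename):
    """Split a voice file name into its parts, or return None if it does not fit."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    return {
        "prefix": match.group("prefix"),      # e.g. M01 = Mission 1, TR = Training
        "radio": match.group("kind") == "r",  # little r = radio, little v = non-radio
        "number": int(match.group("num")),    # position in the sequence
    }

print(parse_voice_name("M01r01.ogg"))
print(parse_voice_name("UNIV_v01"))
```

Names that do not follow the pattern, such as the sound fx file birdsing_01, come back as None, so they can be skipped when building a catalog.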
If I get time, maybe I will work on cataloging some of this, but no promises.
Has anybody started this already somewhere else?