I have an array (single column) of filenames that contain IDs that I would like to extract to another column. The filenames have varying length and formats and the IDs can also have slightly differing formats, hence the difficulty matching values. I would like to extract the ID in a uniform format.
For example:
filenames: ------------>extracted IDs:
xxxxR01_A01xxxx --->R01A01
xxP23_H03xxx ------->P23H03
xP3H3xxxxxxx ------->P03H03
xxP23H03_A01xx ---->P23H03
My problem is identifying which part of the filename contains the ID value, since it can change but is generally consistent within each separate dataset. The first three characters and last three characters of the extracted ID value could be matched to values in a known set of possible values if necessary. Any help with either a formula or macro that cold achieve this would be greatly appreciated.
For example:
filenames: ------------>extracted IDs:
xxxxR01_A01xxxx --->R01A01
xxP23_H03xxx ------->P23H03
xP3H3xxxxxxx ------->P03H03
xxP23H03_A01xx ---->P23H03
My problem is identifying which part of the filename contains the ID value, since it can change but is generally consistent within each separate dataset. The first three characters and last three characters of the extracted ID value could be matched to values in a known set of possible values if necessary. Any help with either a formula or macro that cold achieve this would be greatly appreciated.