I have a super complicated bash script that extracts results from a large output file (produced on a Linux machine, in case this is relevant). As part of this process, I use combinations of grep, head, tail, etc. to extract sub-sections from this larger file; each sub-section of text is saved to a temporary file, which is then processed further. I have produced a simpler example here so I can frame my question, which is:
How can I avoid the need to save to this temporary file?
Rather than saving this sub-section of text to a temporary file, I would like to save it (including carriage returns) to a bash variable which can then be processed further.
The problem is that the bash scripts I am writing do not ‘see’ the carriage returns. In my example below, I have a file ‘examplefile.data’ containing the following text:
START_BLOCK #1
line a b c
line b
END_BLOCK #1
START_BLOCK #2
Line 1 2
Line 2 7
Line 3
Line 4
END_BLOCK #2
START_BLOCK #3
Line x s d e f
END_BLOCK #3
My original script (which saves to a temporary file) works as expected, with the awk command correctly displaying the 2nd token for all lines within each ‘block’:
#!/bin/bash
file="examplefile.data" # File to process
totblock=$(grep -c "START_BLOCK" "$file") # Determine number of blocks of data in file
# Current implementation - which works
for ((l = 1; l <= totblock; l++)); do # Loop through each block of data
    echo "BLOCK $l"
    # Extract subsection of data for current block -> Remove top and bottom -> Save to temporary file
    sed -n '/START_BLOCK #'"${l}"'/,/END_BLOCK #'"${l}"'/p' "$file" |
        grep -Ev "START|END" > TEMPFILE
    # Perform some rudimentary processing on this temporary file to check the overall process is working
    awk '{print $2}' TEMPFILE
done
rm TEMPFILE
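For a check as simple as this one, the pipeline could feed awk directly, which sidesteps TEMPFILE altogether. The following is only a minimal sketch under stated assumptions: the sample data is held inline in a variable so that it runs without any file, and `data` and `extract_block` are hypothetical names introduced for illustration. It only helps when a single command consumes each block; it does not cover the case where the block has to be kept for several later steps.

```shell
#!/bin/bash
# Sample data from above, held inline so the sketch runs without a file (assumption)
data=$(cat <<'EOF'
START_BLOCK #1
line a b c
line b
END_BLOCK #1
START_BLOCK #2
Line 1 2
Line 2 7
Line 3
Line 4
END_BLOCK #2
START_BLOCK #3
Line x s d e f
END_BLOCK #3
EOF
)

# Hypothetical helper: print the body of block number $1 without its START/END lines
extract_block() {
    printf '%s\n' "$data" |
        sed -n '/START_BLOCK #'"$1"'/,/END_BLOCK #'"$1"'/p' |
        grep -Ev "START|END"
}

# Count the blocks the same way the original script does
totblock=$(printf '%s\n' "$data" | grep -c "START_BLOCK")

for ((l = 1; l <= totblock; l++)); do
    echo "BLOCK $l"
    # Pipe the extracted block straight into awk; nothing is written to disk
    extract_block "$l" | awk '{print $2}'
done
```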
If I then attempt to save what would have been saved to TEMPFILE to a bash variable (bashvar) instead, all carriage returns are lost, resulting in one long line. As a consequence, the awk command only shows the 2nd token of the first line, which is not what I want:
#!/bin/bash
file="examplefile.data" # File to process
totblock=$(grep -c "START_BLOCK" "$file") # Determine number of blocks of data in file
# New implementation with the aim to avoid the need to write to a temporary file (TEMPFILE)
for ((l = 1; l <= totblock; l++)); do
    echo "BLOCK $l"
    # As above but rather than piping the output to a file, save it to a bash variable
    bashvar=$(sed -n '/START_BLOCK #'"${l}"'/,/END_BLOCK #'"${l}"'/p' "$file" |
        grep -Ev "START|END")
    # Perform the same rudimentary test to confirm the overall process is working
    echo $bashvar | awk '{print $2}'
done
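One likely explanation, offered as a sketch rather than a definitive diagnosis: the command substitution itself keeps the line breaks inside the text, and it is the unquoted expansion in `echo $bashvar` that lets the shell split the value into words, which echo then prints separated by single spaces. The snippet below uses a hard-coded stand-in for block #2 (an assumption, so that it runs without the data file) and compares the unquoted and quoted forms; `unquoted` and `quoted` are illustrative names only.

```shell
#!/bin/bash
# Stand-in for the text of one extracted block (assumption: hard-coded for illustration)
bashvar=$(printf 'Line 1 2\nLine 2 7\nLine 3\nLine 4\n')

# Unquoted expansion: word splitting turns the line breaks into plain spaces,
# so awk receives a single long line
unquoted=$(echo $bashvar | awk '{print $2}')

# Quoted expansion: the line breaks survive, so awk receives four separate lines
quoted=$(printf '%s\n' "$bashvar" | awk '{print $2}')

echo "unquoted result:"
echo "$unquoted"
echo "quoted result:"
echo "$quoted"
```

If the quoted form behaves the same way in the full script, a here-string such as `awk '{print $2}' <<< "$bashvar"` should be an equivalent way to hand the variable to awk without an extra echo or printf.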