unix - Extracting a word from a file using grep or sed
I have a file in the format below:
file : \\dvtbbnkapp115\nautilus\030db28a-f241-4054-a0e3-9bfa7e002535.dip processed.
entries found : 0
unarchived documents : 1
file size : 1 k
error : following line not processed. bad document type.
error : marketing , contact preference change update||7000003735||078ef1f3-db6b-46a8-bb0d-c40bb2296ab5.pdf
file : \\dvtbbnkapp115\nautilus\078ef1f3-db6b-46a8-bb0d-c40bb2296ab5.dip processed.
entries found : 0
unarchived documents : 1
file size : 1 k
error : following line not processed. bad document type.
error : declined - bureau data (process)||7000003723|252204|2f1d71f4-052c-49f1-95cf-9ca9b4268f0c.pdf
file : \\dvtbbnkapp115\nautilus\2f1d71f4-052c-49f1-95cf-9ca9b4268f0c.dip processed.
entries found : 0
unarchived documents : 1
file size : 1 k
error : following line not processed. bad document type.
error : unable call - please contact|40640510016710|7000003180||3e6a792f-c136-4a4b-a654-37f4476ccef8.pdf
I need to extract the PDF file names that come after the double pipe and write them to a file. I'm a novice when it comes to Unix/sed/grep commands; I have tried, but with no luck. Any ideas or examples I could use to extract the information above?
Thanks
Give this a try if you want the PDF filenames only when they follow the double pipe characters and are the last thing on the line:
sed -n 's/.*||\([^|]*\.pdf\)$/\1/p' inputfile
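For the "write them to a file" part of the question, redirecting the output of sed should be enough. Here is a minimal sketch, assuming the report has one field per line; the sample lines are taken from the question, and `pdfnames.txt` is just a hypothetical output name:

```shell
# Build a small sample input (lines copied from the question's data).
cat > inputfile <<'EOF'
file size : 1 k
error : following line not processed. bad document type.
error : marketing , contact preference change update||7000003735||078ef1f3-db6b-46a8-bb0d-c40bb2296ab5.pdf
error : unable call - please contact|40640510016710|7000003180||3e6a792f-c136-4a4b-a654-37f4476ccef8.pdf
EOF

# -n suppresses sed's default printing; the p flag prints only the lines
# where the substitution succeeded. The dot before "pdf" is escaped so it
# matches a literal '.'. The shell redirect writes the result to a file.
sed -n 's/.*||\([^|]*\.pdf\)$/\1/p' inputfile > pdfnames.txt

# Show what was extracted.
cat pdfnames.txt
```

Lines without a double pipe directly before the filename are simply skipped, so the output file holds only the names.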
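The second PDF filename in your example follows a single pipe character, but there is an earlier set of double pipes on the line. The command below accommodates both of those styles of lines, as long as the filename part does not include pipe characters. Note that it requires at least one more pipe after the double pipes, so a line whose only double pipe sits directly before the filename (like the last entry in your sample) is matched by the first command but not by this one: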
sed -n 's/.*||.*|\([^|]*\.pdf\)$/\1/p' inputfile
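If you would rather have a single command that covers every line shape in the sample, one possible variation (my own sketch, not tested against your real file) is to use the double pipe only as a line selector and then capture whatever follows the last pipe. The file names `inputfile` and `pdfnames.txt` are placeholders:

```shell
# Sample input with all three line shapes from the question.
cat > inputfile <<'EOF'
error : following line not processed. bad document type.
error : marketing , contact preference change update||7000003735||078ef1f3-db6b-46a8-bb0d-c40bb2296ab5.pdf
error : declined - bureau data (process)||7000003723|252204|2f1d71f4-052c-49f1-95cf-9ca9b4268f0c.pdf
error : unable call - please contact|40640510016710|7000003180||3e6a792f-c136-4a4b-a654-37f4476ccef8.pdf
EOF

# /||/ restricts the substitution to lines that contain a double pipe
# anywhere. The greedy .*| then consumes everything up to the last pipe,
# and the group captures the pipe-free text ending in ".pdf".
sed -n '/||/s/.*|\([^|]*\.pdf\)$/\1/p' inputfile > pdfnames.txt

cat pdfnames.txt
```

The trade-off is that this no longer cares where the double pipe sits relative to the filename, only that the line has one.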
If the filenames consist only of hex digits and hyphens, you can be a little more selective, like this:
sed -n 's/.*||.*|\([[:xdigit:]-]*\.pdf\)$/\1/p' inputfile
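Since the question also mentions grep, here is a rough equivalent as a sketch. It assumes your grep supports the `-o` option (GNU and BSD grep do, but it is not required by POSIX) and that the filenames really are just hex digits and hyphens. As before, `inputfile` and `pdfnames.txt` are placeholder names:

```shell
# Sample input with all three line shapes from the question.
cat > inputfile <<'EOF'
error : following line not processed. bad document type.
error : marketing , contact preference change update||7000003735||078ef1f3-db6b-46a8-bb0d-c40bb2296ab5.pdf
error : declined - bureau data (process)||7000003723|252204|2f1d71f4-052c-49f1-95cf-9ca9b4268f0c.pdf
error : unable call - please contact|40640510016710|7000003180||3e6a792f-c136-4a4b-a654-37f4476ccef8.pdf
EOF

# The first grep keeps only lines that contain a double pipe.
# The second grep prints just the matching part (-o): a run of hex digits
# and hyphens followed by a literal ".pdf" at the end of the line.
grep '||' inputfile | grep -o '[[:xdigit:]-]*\.pdf$' > pdfnames.txt

cat pdfnames.txt
```

The pipe character cannot be part of the match because it is neither a hex digit nor a hyphen, so only the trailing filename is printed.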