linux - Extract information from RTF file in shell script
We have many RTF files that need to be uploaded to Oracle EBS under their respective categories. We need to read the information stored in the document properties of each RTF file. The fields are title, subject, author, company, and category.
When I open an RTF file in Notepad I can see this information, but I am not sure how to extract it using Linux commands. Using grep wasn't successful.
I am pasting here the part of the RTF file that holds this information:
\mwrapindent1440\mintlim0\mnarylim1}{\info{\title ^xxsls_gbl_ordack^}{\subject xxsls}{\author ^es_es,es_fr,es_it,es_de^}{\doccomm $header: xxsls_gbl_ordack_es_es.rtf $} {\operator }{\creatim\yr2012\mo11\dy11\hr14\min3}{\revtim\yr2013\mo3\dy2\hr10\min43}{\version24}{\edmins361}{\nofpages4}{\nofwords725}{\nofchars14202}{\*\manager }{\*\company }{\*\category ^bd^}{\nofcharsws14898} {\vern32773}}{\*\userprops {\propname _dochome}\proptype3{\staticval -974575144}}{\*\xmlnstbl {\xmlns1 http://schemas.microsoft.com/office/word/2003/wordml}}\paperw11850\paperh18144\margl851\margr851\margt851\margb0\gutter0\ltrsect
Can you please suggest how I can extract the information as follows:
title=^xxsls_gbl_ordack^
subject=xxsls
author=^es_es,es_fr,es_it,es_de^
category=^bd^
You can use grep with the -E (extended regex) flag and the -o (only matching output) flag. The pattern matches the field name followed by everything up to the closing brace, and sed then strips the field name:
title=$(grep -oE 'title [^\}]+' file.rtf | sed 's/title //g')
echo "title=$title"
subject=$(grep -oE 'subject [^\}]+' file.rtf | sed 's/subject //g')
echo "subject=$subject"
author=$(grep -oE 'author [^\}]+' file.rtf | sed 's/author //g')
echo "author=$author"
category=$(grep -oE 'category [^\}]+' file.rtf | sed 's/category //g')
echo "category=$category"
I get:
title=^xxsls_gbl_ordack^
subject=xxsls
author=^es_es,es_fr,es_it,es_de^
category=^bd^
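Since there are many files to process, the same grep and sed approach can be wrapped in a loop over the field names. The following is only a sketch: the function name rtf_props is made up for illustration, and it assumes each field appears as {\field value} with no closing brace inside the value. It anchors on the leading backslash so that a word such as "title" in the document body is less likely to match, and it keeps only the first hit per field.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): print key=value
# lines for selected RTF \info fields of the file given as $1.
rtf_props() {
    file=$1
    for field in title subject author category; do
        # Match "\field value" up to the closing brace, keep the first hit,
        # then strip the leading "\field " so only the value remains.
        value=$(grep -oE "\\\\${field} [^}]+" "$file" | head -n 1 \
            | sed "s/^\\\\${field} //")
        printf '%s=%s\n' "$field" "$value"
    done
}
```

It could then be called once per file, for example: for f in *.rtf; do rtf_props "$f"; done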