views:

71

answers:

1

I have the following text, I need to extract the exception name and the continuing sentence from the file, but the file has continuous sentences without a space.

??????>?????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
?????????????????????????????????????????????????????????????????????????????????????????B????!48#$%&'+-/0123????5679<=@
>?CA????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????RootEntry?????????F?|?`???__nameid_version10?????????|?`???|?`??__substg10_00020102????????????__substg1
0_00030102????????????????????????????????????????????????????????????????????????????????????????????????!????#$???????
?????????????????+????/????12????456789<=>?@ABCDEFGHIJKLMNOP????RS????UV????????????Z[????????^_????????bc????????fghijk
lmnopqrstuv????????yz?????????????????????????F?F??????????IPMNoteaws-stg-c5-feeds9aws[mazarvoiceSMTPAppender]Applicatio
nmes__substg10_00040102????????????????__substg10_10060102????__substg10_10140102????????__substg10_10150102????????????
__substg10_001A001F????__substg10_0037001F?????????????__substg10_003B0102????$__substg10_003D001F????????????????sageSM
TPBVAPPLICATION@mazarVOICECOM??@??B??+/??/O=ad/OU=ad/CN=RECIPIENTS/CN=_OPERATIONSSUPPORT_OperationsSupport?+????n?T?bvap
plication@mazarvoicecomSMTPbvapplication@mazarvoicecom__substg10_003F0102????P__substg10_0040001F????????????$__substg10
_00410102?????__substg10_0042001F????????????<bvapplication@mazarvoicecom??@??B??+/??/O=ad/OU=ad/CN=RECIPIENTS/CN=_OPERA
TIONSSUPPORT_OperationsSupportEX/O=ad/OU=ad/CN=RECIPIENTS/CN=_OPERATIONSSUPPORTEX/O=ad/OU=ad/CN=RECIPIENTS/CN=_OPERATION
SSUPPORTSMTPbvapplication@mazarvoicecom__substg10_00430102????P__substg10_0044001F????????????$__substg10_00510102????7_
_substg10_00520102????????????7__substg10_0064001F????__substg10_0065001F????????????<__substg10_0070001F?????__substg10
_00710102????????????aws-stg-c5-feeds9aws[mazarvoiceSMTPAppender]Applicationmessage??E????????A??H???EX/O=ad/OU=ad/CN=RE
CIPIENTS/CN=_OPERATIONSSUPPORTEX__substg10_0075001F????__substg10_0076001F????????????f__substg10_0077001F????__substg10
_0078001F????????????f/O=ad/OU=ad/CN=RECIPIENTS/CN=_OPERATIONSSUPPORT?+????n?T?bvapplication@mazarvoicecomSMTPbvapplicat
ion@mazarvoicecombvapplication@mazarvoicecomSMTPBVAPPLICATION@mazarVOICECOMSMTP__substg10_007D001F?????__substg10_0C1901
02????????????"?__substg10_0C1A001F&????%<__substg10_0C1D0102????????????&$MicrosoftMailInternetHeadersVersion20Received
fromdalmailadsolutionscom[17216977]byblrexchadsolutionscomwithMicrosoftSMTPSVC6037903959Sat20Feb2010213019+0530Receivedf
rombarracudaadcom[1721682]bydalmailadsolutionscomwithMicrosoftSMTPSVC6037903959Sat20Feb2010100012-0600X-ASG-Debug-ID1266
681611-1c039f4d0001-azmk4tReceivedfromna3sys009aog114obsmtpcomna3sys009aog114obsmtpcom[74125149211]bybarracudaadcomwithS
MTPidoe0BsQlWiwvBTxEofor<_OperationsSupport@adcom>Sat20Feb2010100011-0600CSTX-Barracuda-Envelope-Frombvapplication@mazar
voicecomReceivedfromsource[241551448]byna3sys009aob114postinicom[7412514812]withSMTPIDDSNKS4AHC703Mnh+uM8i9u1uucP76tMiGb
r6@postinicomSat20Feb2010080012PSTReceivedfrompsmtpcom74125149120byAUSBDCaustinmazarvoicecom100023withMicrosoftSMTPServe
rid821760Sat20Feb2010095958-0600Receivedfromsource[723214889]usingTLSv1byna3sys009amx236postinicom[7412514810]withSMTPSa
t20Feb2010080009PSTReceivedfromaws-build-systemawsaws-build-system[172200110]byc0mailmazarvoicecom8138/8138withESMTPido1
KG08md020220for<dev-log4j@mazarvoicecom>Sat20Feb2010100008-0600Receivedfromaws-stg-c5-feeds9awsaws-stg-c5-feeds9aws[1020
969139]byaws-build-systemawsPostfixwithESMTPid6D6A864294for<dev-log4j@mazarvoicecom>Sat20Feb2010100008-0600CSTReceivedfr
omaws-stg-c5-feeds9awslocalhost[127001]byaws-stg-c5-feeds9awsPostfixwithESMTPid548C1801ABfor<dev-log4j@mazarvoicecom>Sat
20Feb2010100008-0600CSTDateSat20Feb2010100008-0600From<bvapplication@mazarvoicecom>To<dev-log4j@mazarvoicecom>Message-ID
<644663928511266681608162JavaMailtomcat@aws-stg-c5-feeds9aws>X-ASG-Orig-Subjaws-stg-c5-feeds9aws[mazarvoiceSMTPAppender]
ApplicationmessageSubjectaws-stg-c5-feeds9aws[mazarvoiceSMTPAppender]ApplicationmessageMIME-Version10Content-Typemultipa
rt/mixedboundary="----=_Part_51_8002724931266681608155"X-pstn-neptune0/0/000/0X-pstn-levelsS4081870/9990000CV999000FC955
390LC955390R959108P959108M970282C986951X-Auto-Response-SuppressDROOFAutoReplyX-Barracuda-Connectna3sys009aog114obsmtpcom
[74125149211]X-Barracuda-Start-Time1266681611X-Barracuda-URLhttp//17216828000/cgi-mod/markcgiX-Virus-Scannedbybsmtpdatad
comX-Barracuda-Spam-Score001X-Barracuda-Spam-StatusNoSCORE=001usingglobalscoresofTAG_LEVEL=35QUARANTINE_LEVEL=10000KILL_
LEVEL=90tests=BSF_SC0_SA_TO_FROM_DOMAIN_MATCHNO_REAL_NAMEX-Barracuda-Spam-ReportCodeversion32rulesversion32223024Rulebre
akdownbelowptsrulenamedescription----------------------------------------------------------------------------000NO_REAL_
NAMEFromdoesnotincludearealname001BSF_SC0_SA_TO_FROM_DOMAIN_MATCHSenderDomainMatchesRecipientDomainReturn-Pathbvapplicat
ion@mazarvoicecomX-OriginalArrivalTime20Feb20101600120277UTCFILETIME=[C8AD525001CAB245]------=_Part_51_80027249312666816
08155Content-Typetext/plaincharset="us-ascii"Content-Transfer-Encoding7bit------=_Part_51_8002724931266681608155--__subs
tg10_0C1E001F!????'__substg10_0C1F001F????????????<__substg10_0E02001F$????????__substg10_0E03001F????????????????bvappl
ication@mazarvoicecomdev-log4j@mazarvoicecomaws-stg-c5-feeds9aws[mazarvoiceSMTPAppender]Applicationmessage00000002BLREXC
H/O=ad/OU=ad/cn=Recipients/cn=_OperationsSupportMicrosoftExchangeServer__substg10_0E04001F#%????4__substg10_0E1D001F????
?????????__substg10_0E28001F"????-?__substg10_0E29001F????????????0?00000002BLREXCH/O=ad/OU=ad/cn=Recipients/cn=_Operati
onsSupportMicrosoftExchangeServerZxLZFu#O?rcpg125?2CtexA???????PV?U?%Qch??set2?%?3F?03???05"`cP3d36P?0?2-?100a8WARNA?ghi
b?a?utJD@BCExce0iR???SQLHEr`r???S??!a8?????ERROR??OZ?%?rpd?%?URL="jdbcmysql//?stg-c5-m?13306@/bv2?a%0o@?nn?t=t?r__substg
10_1000001F'????"?#__substg10_10090102????????????3^__substg10_1035001F????Q?__substg10_10F3001F????????????T?02-2010000
8WARNorghibernateutilJDBCExceptionReporterSQLError0SQLState0800102-20100008ERRORorghibernateutilJDBCExceptionReporterSQL
exceptionraisedforJDBCURL="jdbcmysql//stg-c5-dbmst13306/bv2?autoReconnect=true&useUnicode=true&characterEncoding=utf-8"!
MESSAGEServerconnectionfailureduringtransactionDuetounderlyingexception'javanetSocketExceptionjavanetConnectExceptionCon
nectiontimedout'BEGINNESTEDEXCEPTIONjavanetSocketExceptionMESSAGEjavanetConnectExceptionConnectiontimedoutSTACKTRACEjava
netSocketExceptionjavanetConnectExceptionConnectiontimedoutatcommysqljdbcStandardSocketFactoryconnectStandardSocketFacto
ryjava156atcommysqljdbcMysqlIO<init>MysqlIOjava284atcommysqljdbcConnectioncreateNewIOConnectionjava2672atcommysqljdbcCon
nection<init>Connectionjava1474atcommysqljdbcNonRegisteringDriverconnectNonRegisteringDriverjava266atorgapachecommonsdbc
pDriverConnectionFactorycreateConnectionDriverConnectionFactoryjava37atorgapachecommonsdbcpPoolableConnectionFactorymake
ObjectPoolableConnectionFactoryjava291atorgapachecommonspoolimplGenericObjectPoolborrowObjectGenericObjectPooljava771ato
rgapachecommonsdbcpPoolingDataSourcegetConnectionPoolingDataSourcejava95atorgapachecommonsdbcpBasicDataSourcegetConnecti
onBasicDataSourcejava548atsunreflectGeneratedMethodAccessor530invokeUnknownSourceatsunreflectDelegatingMethodAccessorImp
linvokeDelegatingMethodAccessorImpljava25atjavalangreflectMethodinvokeMethodjava597atorgspringframeworkaopsupportAopUtil
sinvokeJoinpointUsingReflectionAopUtilsjava310atorgspringframeworkaopframeworkReflectiveMethodInvocationinvokeJoinpointR
eflectiveMethodInvocationjava182atorgspringframeworkaopframeworkReflectiveMethodInvocationproceedReflectiveMethodInvocat
ionjava149atorgspringframeworkaopframeworkadapterThrowsAdviceInterceptorinvokeThrowsAdviceInterceptorjava126atorgspringf
rameworkaopframeworkReflectiveMethodInvocationproceedReflectiveMethodInvocationjava171atorgspringframeworkaopframeworkJd
kDynamicAopProxyinvokeJdkDynamicAopProxyjava204at$Proxy20getConnectionUnknownSourceatorgspringframeworkormhibernate3Loca
lDataSourceConnectionProvidergetConnectionLocalDataSourceConnectionProviderjava82atorghibernatejdbcConnectionManageropen
ConnectionConnectionManagerjava417atorghibernatejdbcConnectionManagergetConnectionConnectionManagerjava144atorghibernate
jdbcAbstractBatcherprepareQueryStatementAbstractBatcherjava105atorghibernateloaderLoaderprepareQueryStatementLoaderjava1
561atorghibernateloaderLoaderdoQueryLoaderjava661atorghibernateloaderLoaderdoQueryAndInitializeNonLazyCollectionsLoaderj
ava224atorghibernateloaderLoaderdoListLoaderjava2145atorghibernateloaderLoaderlistIgnoreQueryCacheLoaderjava2029atorghib
ernateloaderLoaderlistLoaderjava2024atorghibernateloadercriteriaCriteriaLoaderlistCriteriaLoaderjava94atorghibernateimpl
SessionImpllistSessionImpljava1533atorghibernateimplCriteriaImpllistCriteriaImpljava283atorgspringframeworkormhibernate3
HibernateTemplate$36doInHibernateHibernateTemplatejava1061atorgspringframeworkormhibernate3HibernateTemplatedoExecuteHib
ernateTemplatejava419atorgspringframeworkormhibernate3HibernateTemplateexecuteWithNativeSessionHibernateTemplatejava374a
torgspringframeworkormhibernate3HibernateTemplatefindByCriteriaHibernateTemplatejava1051atorgspringframeworkormhibernate
3HibernateTemplatefindByCriteriaHibernateTemplatejava1044atcommazarvoiceccadaohibernateModelDAOHibernatefindModelDAOHibe
rnatejava189atcommazarvoiceccadaohibernateAbstractCriteriaDAOHibernatefindAbstractCriteriaDAOHibernatejava113atcommazarv
oiceccaloggingserviceimplLoggingConfigurationServiceImplupdateFiltersLoggingConfigurationServiceImpljava104atcommazarvoi
ceccaloggingserviceimplLoggingConfigurationServiceImplupdateLoggingConfigurationLoggingConfigurationServiceImpljava95atc
ommazarvoiceccaloggingserviceimplLoggingConfigurationServiceImpl$1runLoggingConfigurationServiceImpljava69ENDNESTEDEXCEP
TIONAttemptedreconnect3timesGivingupP&uU??$???En??g=%0f-8"!?ESSAGE?!`av?+?'?fp?@dqp0?q-?DP1??oun?ly1?''!`'java?+?tSock?%
?!`4wP+?5i6?'??%@?`t'"?"?BEGIEN/0TED!X?%?PTIO9?9?4%?"?/'67/89??ACKTRB?/??<?5???@?9%??$?0mcD?o?3D?F-??y+?JoKsD1?56H?I?Mr@
?<?it>P?M?284N/I?GXK?o$??@S?Qt6??Q?STP?UN14?7Q?I?N"g??aD/?K?\?]?U??N$Ra??%Iq`??!p]?GXKG?T?GXb?M37`O?a_bcPo??ld??KF?D?Obj
Li??f<291g?hO`?fpPG?qc?k?k??!!wk?p??M?77noiH]R$??aD?p%?gd?sv?M?95toubcB>a?wOd?|?M?54?8Z?s3flP?p?$?dMhp?Ac%??5?ppnvoD??kn
?r0?}d???HDelP?g$?a?|Ip????????Qtz^D`p?W?t7???%?9g/$asp?1rPa?w?k{??ob??pbA?0?U%Ab@??Jo?`??tU?p@??e???f?P?????????/??tI??
c?!2Q?}?????M?8V~?o????????D??d?/???49?o???aL?0]?Thr!sAdv??e???%?????o??uM?V@????????????]tO?O?_Jdk?Dy$?m???P`^xK?????mv
0Z?$????w??????I??$?3L??l}GX??????w?????M??O?$\S?Mpaw?]??0??d???M?4????/?Ow??Z&Z???J@A?b?-?|?t{?]???%??eQPK?!??????M?0z_
$k????]???????????M????????do?3??`??{????A3??P?1i?zT?LazyP?l??s??22???????L]??Z????????w?Ig????C{???????????????]?]"D0C?
U?J???y???$kp?`??????M????????_Qu???o?H$?T???Q?$?Q??????????%??`%?r?%0?????????o'WP??hN?!/???_y?g??o???3pdB???U??5????!?
?%??Wrb??a-?????K???bP????M???AO#?"??P?_???&/'??&6?????1~O-Q??pg\??qs`??pt???5#F1`g}???P?5???p??!0F?0?!???67?$?2?3?4?6?9
?8'???DA?oz?<??=?>?C7?$1?P?0B?C??c?M?ENDNESTEOEXCEPTUNN?M?At3???d?]?3???`b@G]?w?p?4}Tp<644663928511266681608162JavaMailt
omcat@aws-stg-c5-feeds9aws>aws-stg-c5-feeds9aws%3A[mazarvoiceSMTPAppender]ApplicationmessageEMLt<????G???k???__substg10_
300B0102+-????W__substg10_3FF8001F????????????X<__substg10_3FF901020????Y?__substg10_3FFA001F????????????\<bvapplication
@mazarvoicecom?+????n?T?bvapplication@mazarvoicecomSMTPbvapplication@mazarvoicecombvapplication@mazarvoicecom?+????n?T?b
vapplication@mazarvoicecomSMTPbvapplication@mazarvoicecom__substg10_3FFB0102/2????]?__substg10_8000001F????????????`2__s
ubstg10_8001001F14????a?__substg10_80020102????????????d2MicrosoftExchangeServer00000002BLREXCH/O=ad/OU=ad/cn=Recipients
/cn=_OperationsSupportMAPI//00000002/00000000@00?z?`??@00?z?`????J>???j4y?f??6__properties_version10035????e?__recip_ver
sion10_#00000000????????9?|?`???|?`??__substg10_0FF60102????????????w__substg10_0FFF010268????x?&67?@9??E??$??P?@&A??B>C
P?D&Q7?R7?de>p?q?uvhwxh}???>$?>@?+??E?????#^?5???????0????o??????>??????>???????@@@@v@y@?4???2?=???+????n?T?dev-log4j@ma
zarvoicecomSMTPdev-log4j@mazarvoicecomdev-log4j@mazarvoicecomSMTPdev-log4j@mazarvoicecomSMTPDEV-LOG4J@mazarVOICECOMdev-l
og4j@mazarvoicecom__substg10_3001001F????????????{2__substg10_3002001F7????|__substg10_3003001F????????????}2__substg10_
300B0102<????~__substg10_3A20001F????=????2__properties_version100??????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
?????????????040040?4@??0

I need to extract the following keywords: ConnectException, javanetConnectException and any exception as well as its exception type.

Please help me in writing a regex for this?

A: 

It's really hard to come up with a regex when there's no easy way to determine where your sentence ends. This regex will match your two exceptions followed by one uppercase letter followed by any number of lowercase letters:

(ConnectException|javanetConnectException)[A-Z][a-z]*

When I copy/paste the sample from your question, the pasted text includes line breaks and the space at the end of each line. If those occur in your actual file too, strip them out first by searching for \s+ and replacing with nothing.

In your sample my regex finds 3 matches after stripping spaces and line breaks:

javanetConnectExceptionConnectiontimedout
javanetConnectExceptionConnectiontimedout
javanetConnectExceptionConnectiontimedoutatcommysqljdbc
Jan Goyvaerts