TOP SECRET
Communications Security Establishment Canada / Centre de la sécurité des télécommunications Canada
Safeguarding Canada's security through information superiority / Préserver la sécurité du Canada par la supériorité de l'information

CSEC ITS/N2E Cyber Threat Discovery
DISCOCON 2010
A CND Perspective

N2E Cyber Threat Discovery
- Discovery: Context
- Sitrep
- Current Capabilities
- CND Metadata Analysis at Scale

N2E Context
- Within [...] CND operations concentrated in N2
  - Core: incident response team (alerts, analysis, mitigation advice)
  - Malware/RE and VR teams
- Discovery has always been a required activity within N2
  - IR often takes precedence
- N2E established to focus effort on discovery
  - Hunting vs. firefighting: new threats, techniques, tradecraft
  - Iterations of hypothesis-based research/analysis
    - What evidence is there in the data of compromised systems?
    - What new threats or techniques can we find operating against us?
  - Development of effective anomaly detection/heuristic techniques
    - How can we better detect these new threats in the future?
- Shape capability development priorities

N2E Sitrep
- Team formally established earlier this year
- Growing to 7; hiring underway, strong candidates in pipeline
- Excellent access to full-take data
- Analytical environment improving
- Progress on policy support
  - Use of intercepted private communications
  - Sharing mechanisms

Our Haystack
- Three individual clients
  - 10's to low 1000's of signature-based alerts each day (majority false positive)
- Soon: "official" Internet gc.ca aggregation point
- Full pcaps (retention: days to months)
  - 1's to 10's of TB of passively tapped network traffic each day
- Metadata (retention: months to years)
  - 10's to 100's of GB of non-indexed, textual network metadata per day (descrip[...]

Notes on gc.ca:
- System capacity is 400 TB per month
- Expected to start at [...] per month (68 departments), but expected to grow to capacity over the next 2 years (>100 departments)

Toolset
- Homemade wrappers for heuristic detection tools
- Popquiz/Slipstream for metadata
- find, xargs, grep, sed, cut, awk, sort, uniq -c, perl, python

Notes:
We have a bunch of heuristic tools that are good at detecting unknown malware. They are mainly wrappers around tools like 8Ball and metadata produced from raw network traffic. Detections from the wrappers are handled through the alert management system, but the metadata is analyzed manually, using good old Unix text [...]
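
As an illustration of the manual, text-tool style of metadata analysis described in the Toolset notes, here is a minimal Python sketch of the classic "grep, cut a field, sort, uniq -c" pattern. The tab-separated record layout, the field positions and the function name count_field are assumptions made for the example; the actual Popquiz/Slipstream output format is not described in the slides.

```python
from collections import Counter


def count_field(lines, pattern, field_index, sep="\t"):
    """Count the values of one field among the lines that contain `pattern`.

    Roughly what a grep / cut / sort / uniq -c / sort -rn pipeline does.
    """
    counts = Counter()
    for line in lines:
        if pattern not in line:
            continue  # the "grep" step
        fields = line.rstrip("\n").split(sep)  # the "cut" step
        if field_index < len(fields):
            counts[fields[field_index]] += 1  # the "sort | uniq -c" step
    return counts.most_common()  # highest counts first


# Hypothetical metadata records: timestamp, protocol, name, client address.
sample = [
    "2010-10-17T15:49:00\tdns\texample.test\t192.0.2.10",
    "2010-10-17T15:49:05\tdns\texample.test\t192.0.2.11",
    "2010-10-17T15:50:00\tdns\texample.org\t192.0.2.10",
    "2010-10-17T15:51:00\thttp\texample.test\t192.0.2.12",
]

for value, count in count_field(sample, "\tdns\t", 2):
    print(count, value)
```

The same function works on an open file handle, since it only iterates over lines, which keeps memory use flat on large non-indexed text dumps.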

Homemade Wrappers: PonyExpress
SMTP processing 8Ball wrapper to process all email attachments
- Cuts and rebuilds SMTP sessions from full packet captures
- Extracts and logs metadata from SMTP and RFC 822
- Extracts attachments and sends them to a cluster of 8Ball services
- Wrapper is able to manage other deep scan tools ([...] AV)
- Aggregates results from scans
- Generates alerts for [...] to look at, with full SMTP pcap
- Catches new implants or first stages delivered using attachments

Notes:
- Currently processes 400,000 emails per day
- These are all emails that were previously filtered by anti-virus software
- Out of these, 400 get promoted to the alert system
- [...] of these alerts are worthy of reporting to the client
- With gc.ca coming, PonyExpress will be in front of anti-virus, due to the location of collection points
- Looking at deploying appliances in front of PonyExpress

Homemade Wrappers: Stripsearch
Automated binary analysis
- Recursively runs a binary through the following tools: [...]
- Correlation with file repository and reports
- Future: greater degree of automation

Notes:
- Modules are run in parallel, and some sequentially: filters are run first to exclude known good
- Each module can be selectively disabled or enabled
- Recursive means that when 8Ball finds an embedded file, it extracts it and runs it through Stripsearch independently
- Currently processes our malware repository (~1500 samples) in 1h30
- Capable of processing folders and subfolders
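
The Stripsearch notes describe three behaviours: a known-good filter that runs first, modules that can be switched on or off, and recursion into embedded files. The sketch below shows one way that control flow could look. It is not the actual Stripsearch code: the module table, the toy_extract helper and the dictionary-based sample format are all hypothetical stand-ins, and real extraction would be done by a tool such as 8Ball.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def analyse(sample, modules, known_good, extract_embedded, depth=0, max_depth=5):
    """Run enabled modules on a sample, then recurse into its embedded files."""
    digest = sha256(sample["data"])
    report = {"name": sample["name"], "sha256": digest, "results": {}, "children": []}

    # Filter first: skip anything already known to be good.
    if digest in known_good:
        report["results"]["filter"] = "known good, skipped"
        return report

    # Each module is an (enabled, function) pair, so it can be toggled.
    for name, (enabled, func) in modules.items():
        if enabled:
            report["results"][name] = func(sample["data"])

    # Every embedded file is analysed independently, with a depth limit.
    if depth < max_depth:
        for child in extract_embedded(sample):
            report["children"].append(
                analyse(child, modules, known_good, extract_embedded, depth + 1, max_depth)
            )
    return report


# Hypothetical modules, for illustration only.
MODULES = {
    "size": (True, len),
    "has_mz_header": (True, lambda data: data[:2] == b"MZ"),
    "disabled_example": (False, lambda data: None),
}


def toy_extract(sample):
    """Stand-in for a real extractor: embedded files are listed explicitly."""
    return sample.get("embedded", [])


doc = {
    "name": "outer.doc",
    "data": b"doc-bytes",
    "embedded": [{"name": "inner.bin", "data": b"MZ\x90\x00"}],
}
report = analyse(doc, MODULES, known_good=set(), extract_embedded=toy_extract)
print(report["children"][0]["results"])
```

The depth limit is an addition of this sketch, intended to keep a maliciously nested file from causing unbounded recursion.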

Metadata
Standard
- Flows (with protocol guessing)
- DNS
  - Queries, answers and beaconing
- SMTP
  - RFC 788 (SMTP), RFC 822 (2822, 5322) Internet Message Format
- HTTP
  - All server-side headers
  - User-Agent summary
  - List of POST without a GET, URLs and PE downloads

Notes:
- SMTP comes from PonyExpress; all others come from Popquiz
- HTTP server headers and the User-Agent summary are based on the first payload packet, so when headers span more than one packet the data gets truncated
- Sorting yields strange side effects, such as a nice staircase-like output:
    User-Agent:
    User-Agent: bla
    User-Agent: blab
    User-Agent: blabl
    User-Agent: blabla
- Found some SQL injection attempts in the Server string: [...] TABLE servertypes;

Metadata: findmask

Notes:
- When we have more than 7-10 unique known strings in the same flow, it is usually worth investigating. Anything less is most likely a false positive.
- Looks for strings such as:
  - Windows API calls
  - Parts of Registry keys

Noteworthy Catches
WMI-based implant
- Trend Micro has a good report on this type of implant
- Uses WMI for persistence instead of the Registry or boot sector
  - Resides in ActiveScriptEventConsumer
- First seen in September 2009
  - See [...] and [...]
- Caught with PonyExpress
- Stripsearch fuzzy correlation linked it with previous reports
- Since the first version, we have seen an increase in sophistication
  - At first, the WMI event was created through a script run by CSCRIPT.EXE
  - Now, [...] is able to create the WMI event from the binary
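
The findmask notes above give a simple rule of thumb: a flow containing more than roughly 7-10 unique known strings is usually worth a look. A minimal sketch of that rule follows. The string list is an illustrative assumption (common Windows API names and one Registry key fragment), not the list used by the actual tool, and the threshold of 7 is simply the lower bound quoted in the notes.

```python
# Illustrative list only; the real tool's string set is not given in the slides.
KNOWN_STRINGS = [
    b"CreateProcessA",
    b"VirtualAlloc",
    b"LoadLibraryA",
    b"GetProcAddress",
    b"WriteProcessMemory",
    b"CreateRemoteThread",
    b"RegSetValueExA",
    b"WinExec",
    b"URLDownloadToFileA",
    b"InternetOpenA",
    b"Software\\Microsoft\\Windows\\CurrentVersion\\Run",
]


def unique_hits(payload: bytes, known=KNOWN_STRINGS):
    """Return the set of known strings that occur in the flow payload."""
    return {s for s in known if s in payload}


def worth_investigating(payload: bytes, threshold: int = 7) -> bool:
    """Flag a flow when it contains more than `threshold` unique known strings."""
    return len(unique_hits(payload)) > threshold


suspicious_flow = b"\x00".join(KNOWN_STRINGS)          # contains all 11 strings
benign_flow = b"GET /index.html ... LoadLibraryA ..."  # contains only one

print(worth_investigating(suspicious_flow), worth_investigating(benign_flow))
```

Counting unique strings rather than total occurrences matters here: a page that merely repeats one API name many times should not cross the threshold.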

Noteworthy Catches
Google IP addresses used as a sleep command
- From DNS metadata
  - When the call-back domain was queried, a Google IP was used as a sleep command.
  - Searching for Google in DNS metadata revealed new TROPPUSNU domains and infected workstations.
Suspicious RDP and TOR sessions
- From flow metadata we identified:
  - RDP
    - Incoming RDP sessions from various locations to the [...] RDP service (uses the same certificate)
    - Still under investigation
  - TOR
    - Was picked up as a suspicious burst of outgoing SSL connections going to several locations from a single source

Noteworthy Catches:
- Most likely a breach of Internet usage policy
- All SSL certs were:
  - Self-signed
  - Certificate has a validity window of just 2 hours ([...] 101017152421Z to 101017172421Z)
  - Issuer name appears to be a randomly generated fully qualified domain name (FQDN), unique for each destination IP ([...] b2wwzduvdc5jyty357.net)
  - Common Name (CN) field appears to be a randomly generated FQDN, unique for each session

Noteworthy Catches:
- All sessions for October 17 were extracted from pcap using the following ngrep hex string:
    ngrep [...] pcap -tq [...]
    2010/10/17 15:49:00.699087 [...] 10101715242[...] 17242120[...]
- The string is the end of the validity period (date only)
- A Snort signature could be generated to look for certs whose validity period starts and ends on the same day
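
The certificate properties listed above (self-signed, very short validity window) lend themselves to a simple check. The sketch below parses the two ASN.1 UTCTime values quoted on the slide and measures the window between them. The function names and the three-hour cut-off are assumptions made for illustration; the slides only state that the observed window was 2 hours.

```python
from datetime import datetime, timedelta


def parse_utctime(value: str) -> datetime:
    """Parse an ASN.1 UTCTime string of the form YYMMDDHHMMSSZ."""
    return datetime.strptime(value, "%y%m%d%H%M%SZ")


def validity_window(not_before: str, not_after: str) -> timedelta:
    """Length of the certificate validity period."""
    return parse_utctime(not_after) - parse_utctime(not_before)


def looks_short_lived(not_before, not_after, self_signed,
                      max_window=timedelta(hours=3)):
    """Flag self-signed certificates with a very short validity window.

    The 3-hour cut-off is an assumption chosen for this example.
    """
    return self_signed and validity_window(not_before, not_after) <= max_window


# Values quoted on the slide.
window = validity_window("101017152421Z", "101017172421Z")
print(window)
print(looks_short_lived("101017152421Z", "101017172421Z", self_signed=True))
```

Comparing parsed timestamps rather than matching a date substring avoids having to change the rule every day or month, at the cost of needing the certificate fields already extracted as metadata.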

Notes:
The easiest signature would be to have Snort look for a validity period starting and ending in the same month. However, the signature would have to be changed every month. A search for October, looking only at certs ending in October (1010), resulted in a lot of false positives, but these were easily managed by grepping for a start date in October. Only one false positive was left.

CND Metadata Analysis at Scale
Selected SAWUNEH Results
"Congratulations, it only took you 65299 seconds"

General Problem: Information Overload
- What is "at scale" for CSEC?
- Problem: too much data
  - Acquired, retained, summarized, analyzed, presented to [...]
- Opportunities at all tiers of the system
- Selected SAWUNEH topics help in one or more of these areas

Notes:
- Today:
  - 1's to 10's of TB of pcap retained each day
  - 10's to 100's of GB of metadata retained each day
- Future:
  - Metadata continues to increase linearly with new access points
  - Attempt to hold pcap steady at current rates through smarter retention policies
- Tiers: [...]

What is SAWUNEH?
- Literally: "Summer Analytic Workshop Up North [...]
- Annual data analysis workshop hosted at CSEC by [...] shop
- 2010 thrust was CND
- Reps from each of the 5 Eyes [...] cleared researchers
- Helps fill the (sparsely inhabited) plane that often exists between OpsDev and [...]
- Special assets: Netezza, Cray XMT

Notes:
- Many researchers, but the focus is "applied"
- Netezza: distributed data appliance made available to facilitate the workshop
- OpsDev vs [...] rather than Ops vs [...]

Topic 1: Correlating "E-Mail, Web, Flow" Metadata
- E-mail based content delivery attacks against the GC are *very* common
- CND capability rapidly developed and deployed; relies heavily on attachment scanning
- The inevitable evasion
  - Attacks now commonly use email as an inducement to visit a URL, instead of direct content delivery
- Too many benign hyperlinks delivered over email
  - Need to reduce the metadata presented to the front line

Notes:
- PonyExpress: yields many detections with low false positive rates
- Commercial sector has responded in similar fashion
- Email as inducement: delivery of an HTTP URL over e-mail; the exploit is delivered via HTTP

Topic 1: Correlating "E-Mail, Web, Flow" Metadata
Reducing Email URL metadata volume delivered to the front line
- Select a highly suspicious subset of Email URL metadata based on correlation
- Only present "Email" metadata if:
  - Email was inbound
  - Contained one or more hyperlinks
  - Hyperlink nominated as suspicious
  - Hyperlink was actually visited by a recipient
- Provide flow and metadata for the resulting HTTP session

Notes:
- Initial definition of "suspicious" was naive: a suffix dictionary
- Having an actual visit helps because:
  - the attack is more likely to have been successful
  - it means the analyst likely has access to related web traffic to assist with triage
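
The four selection conditions above map naturally onto a join across email, URL and HTTP flow metadata. The sketch below expresses them as one SQL query, run here against an in-memory SQLite database purely so the example is self-contained. The table and column names, the precomputed suspicious flag and the assumption that a recipient can be tied to a client IP address are all hypothetical; none of them come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE email     (msg_id INTEGER PRIMARY KEY, direction TEXT, recipient_ip TEXT);
CREATE TABLE email_url (msg_id INTEGER, url TEXT, suspicious INTEGER);
CREATE TABLE http_flow (flow_id INTEGER PRIMARY KEY, client_ip TEXT, url TEXT);

INSERT INTO email VALUES
  (1, 'inbound',  '192.0.2.10'),   -- suspicious link, visited
  (2, 'inbound',  '192.0.2.11'),   -- suspicious link, never visited
  (3, 'outbound', '192.0.2.12'),   -- not inbound
  (4, 'inbound',  '192.0.2.13');   -- benign link, visited

INSERT INTO email_url VALUES
  (1, 'http://example.test/update.exe', 1),
  (2, 'http://example.test/other.exe',  1),
  (3, 'http://example.test/update.exe', 1),
  (4, 'http://example.org/news.html',   0);

INSERT INTO http_flow VALUES
  (10, '192.0.2.10', 'http://example.test/update.exe'),
  (11, '192.0.2.12', 'http://example.test/update.exe'),
  (12, '192.0.2.13', 'http://example.org/news.html');
""")

# Inbound email + contains a hyperlink + hyperlink flagged suspicious
# + the same URL was fetched by the recipient's address.
QUERY = """
SELECT e.msg_id, u.url, f.flow_id
FROM email AS e
JOIN email_url AS u ON u.msg_id = e.msg_id
JOIN http_flow AS f ON f.url = u.url AND f.client_ip = e.recipient_ip
WHERE e.direction = 'inbound'
  AND u.suspicious = 1
ORDER BY e.msg_id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Returning the flow identifier alongside the message gives the analyst the pointer to the resulting HTTP session, which is the context the slide asks to provide.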

Topic 1: Correlating "E-Mail, Web, Flow" Metadata
SAWUNEH approach
- Import EMAIL, URL, FLOW metadata into separate tables on the NETEZZA appliance (distributed DB)
- Single SQL query to fill the requirement: [...]

"Email Web" attack detection, SAWUNEH style
Findings:
- 50% of the unique results were malicious
  - Significantly lower false positive rates compared to URL inspection only
- Result provides an anomaly tip, but also context at the analyst's fingertips
Next steps:
- Better definition of a "suspicious" URL
- Automate extraction of web content from the session for [...]
- Automate feeding of web payloads into the content scanning system
  - Partial mitigation of the original evasion
  - Higher analyst productivity

Notes:
- Possible "suspicious" definitions: URLs tagged with "Ob[...]JavaScript, FlashContent, LedToDownload" etc.
- Correlate on URL "trees", not isolated URLs (otherwise redirects, remote hrefs etc. defeat scanning). But this is somewhat expensive (not sure if this is solved in a feasible way)
- Primary metadata sources are too large for [...]; need value-add reductions
- Reduced "working sets" of metadata give a higher yield for analyst time
- They evade by moving payload delivery to a different session. We follow them and (attempt to) close the evasion.

Topic 2: Malicious Attachment Prediction
Problem:
- Our current email scanning system fully processes every email it sees, in a modular fashion
- Works very well today, but we can't afford to scale this approach "as-is" to meet future requirements
- Large proportion of time spent in *late* stages of processing,
  especially deep scan
Possible approach:
- Can we predict which emails will score in the deep scan based only on metadata extracted during earlier processing phases?
- If so, can we drop those that have a low probability of deep scan success?

Notes:
- Fully process = sessionize, smtp, rfc822, mime, deep scan (8Ball) of all attachments, etc.
- Optimization axiom: find your hotspot

Malicious Attachment Prediction
- Given metadata from each processing phase, build a predictive model to selectively promote emails with a high probability of non-zero scoring in 8Ball, based on features extracted early in processing
Choice of predictive model
- Random Forests: decision-tree-based classifier (Breiman 2001)
  - Take a feature set (X1, ..., Xp) and a corpus of training samples with *known* classification as input
  - Generate a Random Forest of decision trees to be used to predict classification
  - New samples pass through the previously trained forest, which yields the probability that the deep scan will be non-zero

Notes:
- Phases: ip, tcp, smtp, rfc822, mime etc.
- Classifier: based on the probability of scoring non-zero in deep scan
- Example features include: [...]
- In our case: email metadata types
- 10000 previously scanned samples with full metadata, including overall scan score, were used to build the forest
- Note: only build/train the forest once. Subsequently you are just passing metadata through and getting P(email scores non-zero)

Malicious Attachment Prediction
Data Reduction vs. "Interesting" Data Loss
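
The trade-off named in the slide title can be made concrete once each email has a predicted probability of scoring non-zero in the deep scan. The sketch below assumes such probabilities already exist (from a trained forest or any other classifier) and computes, for a chosen drop threshold, the fraction of emails discarded and the fraction of "interesting" emails lost. The function name and the toy scores are invented for illustration and are not results from the workshop.

```python
def reduction_and_loss(scored, threshold):
    """Return (data reduction, interesting-email loss) for a drop threshold.

    `scored` is a list of (probability, is_interesting) pairs, where the
    probability is the model's estimate that the deep scan scores non-zero.
    Emails below the threshold are dropped before the deep scan.
    """
    dropped = [(p, hit) for p, hit in scored if p < threshold]
    total_hits = sum(1 for _, hit in scored if hit)
    lost_hits = sum(1 for _, hit in dropped if hit)
    reduction = len(dropped) / len(scored)
    loss = lost_hits / total_hits if total_hits else 0.0
    return reduction, loss


# Toy data: ten emails, two of which really were interesting.
scored = [
    (0.02, False), (0.05, False), (0.10, False), (0.03, False), (0.08, False),
    (0.40, False), (0.90, True), (0.15, True), (0.01, False), (0.70, False),
]

for threshold in (0.05, 0.2, 0.5):
    reduction, loss = reduction_and_loss(scored, threshold)
    print(f"threshold={threshold}: reduction={reduction:.0%}, loss={loss:.0%}")
```

Sweeping the threshold over a held-out set of previously scanned emails would trace out the reduction-versus-loss curve, from which an operating point can be picked.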
Malicious Attachment Prediction

Result:
- 85% data reduction with 1-3% loss of 'interesting' emails
- Can discard majority of emails with minimal loss in positive hits
- Prediction with model is relatively cheap (10,000s of features per second)

Next steps:
- Extract those features which contribute most to prediction and incorporate appropriate filters in each stage of the email processing system, with the aim to discard early and often
- There are likely a few simple features contributing disproportionately to predictive power: flow size, attachment size, attachment type, etc.

Notes:
- Could be used either permanently, or simply as a learning tool to find which features you might want to filter on
- Emphasis on feature predictors available early in processing (flow, smtp, 822 header rather than MIME content, etc.)

Topic 3: PE Masquerade Detection

- Multiple threat actors observed masquerading Windows PE downloads using 'benign' filename extensions (jpg, gif, etc.)
- We log metadata on HTTP sessions containing a PE header in the content of the first packet (modules available in both [name missing] and POPQUIZ)
- Exclude entries with suffixes common to PE downloads (exe, bin, php, asp)
- Provide contextual metadata and 1-click access to payloads
- Easy to see entries that jumped out: file extensions such as .jpg, .gif are highly suspicious, often malicious
- Also filter by 'uniqueness' over time to eliminate AV / OS updates
- Next Step: automate extraction of PE masquerades and push through content scanning system
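The PE masquerade check described above can be sketched as follows. This is a minimal illustration, not the logging module referred to in the slides: the function names, the example URLs, and the decision to treat a bare "MZ" magic as a probable PE when the capture is too short are all assumptions. The suffix exclusion list (exe, bin, php, asp) is the one given on the slide.

```python
import posixpath
import struct
from urllib.parse import urlparse

# Suffixes the slide lists as common for legitimate PE downloads.
EXPECTED_PE_SUFFIXES = {"exe", "bin", "php", "asp"}


def looks_like_pe(first_bytes: bytes) -> bool:
    """Heuristically decide whether a payload prefix is a Windows PE file."""
    if not first_bytes.startswith(b"MZ"):  # DOS header magic
        return False
    if len(first_bytes) < 0x40:
        return True  # assumption: only the MZ magic is visible, treat as probable PE
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", first_bytes, 0x3C)
    if pe_offset + 4 > len(first_bytes):
        return True  # assumption: signature lies beyond the captured bytes
    return first_bytes[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def url_suffix(url: str) -> str:
    """Return the lowercase filename extension of the URL path, without the dot."""
    _, ext = posixpath.splitext(urlparse(url).path)
    return ext.lstrip(".").lower()


def is_pe_masquerade(url: str, first_bytes: bytes) -> bool:
    """Flag PE content served under a suffix not normally used for PE downloads."""
    return looks_like_pe(first_bytes) and url_suffix(url) not in EXPECTED_PE_SUFFIXES


if __name__ == "__main__":
    # Minimal synthetic header: MZ magic, e_lfanew = 0x40, PE signature at 0x40.
    fake_pe = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
    print(is_pe_masquerade("http://example.test/images/cat.jpg", fake_pe))
    print(is_pe_masquerade("http://example.test/setup.exe", fake_pe))
```

The 'uniqueness' over time filter mentioned in the slide is not covered here; it would need a history of previously seen URLs or payloads.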
Final Notes

- SAWUNEH outcomes made a few things clearer to us
- Our days of plain old grep as primary search tool are nearing an end
- Relational DBs provide us with a few key benefits:
  - Simple indexing of existing metadata (sub-second searches)
  - Ability to correlate easily across 'primary' metadata outputs
  - Significantly faster hypothesis testing
- Depending on scale, DB overhead may be too costly
  - In such cases, still a very useful tool for discovery work on finite snapshots
  - Can be used to test correlation hypotheses on finite data sets before spending significantly more cycles implementing a streaming / on-line version
- Simple correlation techniques can provide high yield metadata to [text missing]
- Islands of primary metadata are becoming unwieldy
- Post-processing of primary metadata (correlation, newness, uniqueness, reduction techniques, etc.) is becoming a requirement to mitigate information overload

Thanks
cse-cst.gc.ca