Advanced HTTP Activity Analysis
2009

Goal
The goal of this training is to get you familiar with basic HTTP traffic and understand how to target and exploit it using X-KEYSCORE.

Agenda

What is HTTP?
HTTP stands for Hypertext Transfer Protocol and it's the primary protocol for transferring data on the World Wide Web.

Why are we interested in HTTP?
Because nearly everything a typical user does on the Internet uses HTTP. Almost all web browsing uses HTTP:
- Internet surfing
- Webmail (Yahoo/Hotmail/Gmail/etc.)
- OSN (Facebook/MySpace/etc.)
- Internet Searching (Google/Bing/etc.)
- Online Mapping (Google Maps/Mapquest/etc.)

How does HTTP work?
- HTTP is comprised of requests from clients to servers and their corresponding responses.
- Many are already familiar with the terms "client-to-server" or "server-to-client" collection (also referred to as "client side" or "server side" collection).
- A "Client" usually refers to a browser (like Firefox or IE), which is also referred to as the "User Agent".
- The "Server" can also be referred to as the "web-server" or "origin-server", which is the machine that stores the data being accessed (like a web page, a map, an inbox, etc.).

HTTP Activity
HTTP activity comes in two types:
- Client-to-Server "requests"
- Server-to-Client "responses"
(Diagram: a user's client exchanging requests and responses with the Website.com server.)
While there may be a variety of Proxies, Gateways or Tunnels in between the client and the server, traffic is always going in one direction or the other.

Client vs. Server Side Traffic
How do you know which side you're looking at?
- Client-to-Server requests are generally small in size and are computers talking to other computers.
- They contain standard HTTP header fields like "Host", "Accept", "Connection", etc.
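As a rough illustration of the header fields just mentioned, the sketch below splits a raw client-to-server request into its request line and its header fields. The `parse_request` helper, the sample request and its header values are made up for illustration; nothing here is taken from X-KEYSCORE itself.

```python
def parse_request(raw):
    """Split raw HTTP request text into (method, target, headers)."""
    lines = raw.strip().splitlines()
    # Request line, e.g. "GET /home.html HTTP/1.1"
    method, target = lines[0].split()[:2]
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, headers


# Hypothetical sample request; the header values are assumptions.
SAMPLE = (
    "GET /home.html HTTP/1.1\r\n"
    "Host: samplewebsite.com\r\n"
    "Accept: text/html\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

method, target, headers = parse_request(SAMPLE)
print(method, target)
print(sorted(headers))
```

Header names are lower-cased here only so that lookups such as `headers["host"]` do not depend on how the client capitalised them.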
HTTP Activity Examples
Client-to-Server request:
(Screenshot: a captured GET request with User-Agent, Accept, Accept-Encoding, Cookie, Accept-Language, Accept-Charset, Host and Connection header lines.)

Client vs. Server Side Traffic
- Server-to-Client responses are generally larger in size and are what web pages look like on the Internet.
- When you're at a computer accessing the Internet, you're only seeing Server-to-Client traffic.

HTTP Activity Examples
Server-to-Client response:
(Screenshot: a news web page as delivered in a server-to-client response.)
Bonus question: are the images in this web page missing?

HTTP Activity
- XKS HTTP Activity meta-data differs greatly depending on which side of traffic we're collecting.
- In nearly all cases it's better to have client-to-server traffic.

HTTP Client-to-Server metadata
(Screenshot: a GET session with its meta-data fields, including Accept, Referer, Accept-Language, User-Agent, Cookie, Connection, Host, URL Path, URL Args, Browser, Search Terms, Language and Via.)

HTTP Server-to-Client metadata
(Screenshot: a server-to-client session with its meta-data fields.)

HTTP Types
Meta-data will also tell you which side of traffic you're looking at.
- Client-to-server has two main HTTP Types: GET and POST.
- Server-to-client has only one.

Get vs Post
- A GET is you requesting data from the server (most web surfing).
- A POST is you sending data to the server (signing in, filling out a form, composing an e-mail, uploading a file, etc.).

Let's break down the important parts of a client-to-server request.

HTTP Client-to-Server
    GET /home.html
    Host: samplewebsite.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
    Accept: ...
    Accept-Language: ...
    Accept-Encoding: gzip,deflate
    Accept-Charset: ...
    Keep-Alive: 300
    Connection: keep-alive
First thing to note is the Host: line, which tells you the name of the server that the client is requesting data from.

Host Field
It's important to note that in many cases users think they're at a single website, but behind the scenes data is coming from a number of different servers without the user knowing it.
(Screenshot: a list of captured sessions.)
Bonus question: What would the impact of this be on how you formulate your queries using the Host field?

HTTP Client-to-Server
Second, the GET line tells you which file the user is requesting from the server. If you simply take that line and append it to the Host line, you have the live public URL that the user is requesting.

HTTP Client-to-Server
    GET /example.php?...
    Host: samplewebsite.com
When the GET line has a ? mark in it, the GET request is also passing information to the server. So in this case the client is requesting the file example.php but it's also passing along a value that could have been entered by the user.

URL Lines
When there is a ? mark in the URL line, X-KEYSCORE breaks it up into two parts. The first part is called the URL Path and the second part is called the URL Argument.
(Screenshot: a URL shown split into its URL Path and URL Argument fields.)
Notice all of the "arguments" (each separated by an &) in this URL:
(Screenshot: a captured request whose URL carries several arguments.)
Bonus: Any idea what the information that is being passed in the URL Argument in this example is for?

HTTP Client-to-Server
The User-Agent line gives you information on what type of client is requesting the data. In this case, we can see that it was a Firefox 3.0 browser on a Windows NT 5.1 (XP) machine.

User Agents
The User Agent (also known as the "browser") can be very valuable. While it cannot be trusted to be absolutely unique, in many cases you can use it to unwind a proxy or multi-user environment. It can also help provide hints if the request originated from a mobile device.
(Screenshot: example User-Agent strings.)

HTTP Client-to-Server
The various "Accept" lines instruct the server on the types of responses the client can accept back.

Let's look at a simplified version of an HTTP request and response.

What is Web (HTTP) Activity
This shows how a person logs on to a webpage:
1. From Port 3434* to Port 80: the client clicks on http://www.hotmail.com and sends a GET request to the server.
2. To Port 3434* from Port 80: the server returns "Welcome to Hotmail" in an HTTP response.
3. From Port 3434* to Port 80: the client sends EmailAddress: me@hotmail.com and Password: Admin123 in a POST to the web server.
4. To Port 3434* from Port 80: the server returns "Welcome to your Inbox homepage" in an HTTP response.
* The client's port can be any high-numbered port; 3434 is just an example.

HTTP Activity
- Real traffic, however, can be a little more complicated.
- Almost all web pages are built from multiple files. For example, every single image or banner ad on a web page is a separate file that needs to be individually requested before the server that has the file can respond.

HTTP Activity Real World
Let's look at the NSA Today home page.
(Screenshot: the NSA Today home page.)
- It looks like one page, but each of the images and banners is a separate data file that your browser pieces back together.
- In fact, to build the NSA Today home page it takes 34 separate files from 4 different servers.
- However, most people probably don't notice, because the entire page loads in <300 milliseconds.
- If we had a slow internet connection, we'd notice the images would initially be missing.

HTTP Activity Real World
Notice that all of the images are missing. They are all separate server-to-client responses and therefore completely separate "sessions" in X-KEYSCORE or PINWALE.
(Screenshot: a news web page rendered without its images.)
HTTP Activity Real World
- It's important to note that not all of the data on one web page came from the same server.
- For example, most of the NSA Today home page comes from home.www.nsa, but the image of the current weather conditions came from wk-admiral208.corp.nsa.ic.gov.
- This happens all the time on the Internet. The cnn.com home page may have an ad on it that was from the Google ad server, and so on.
- And this does have an impact on our collection!

This is the traffic path for building the NSA Today home page:
(Diagram: the user's connections to each server that supplies part of the page, including home.www.nsa and wk-admiral208.corp.nsa.ic.gov.)
What happens if we only have collection on one of the paths?

What would that traffic look like?
    GET ...
    Host: wk-admiral208.corp.nsa.ic.gov
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
    Accept: ...
    Accept-Language: ...
    Accept-Encoding: gzip,deflate
    Accept-Charset: ...
    Keep-Alive: 300
    Connection: keep-alive
    If-Modified-Since: Thu, 08 Oct 2009 19:31:53 GMT
    If-None-Match: ...
    Cache-Control: max-age=0
If we only saw this one GET request and not the other 33 required to build the NSA Today home page, would we be able to determine what the user was actually doing?

What exactly is that telling us?
- First off, we know what file they are requesting: they want the current weather image from the wk-admiral208.corp.nsa.ic.gov server.
- That is actually a live public URL.
- Do we have any indication why they wanted that image? The answer is yes! Look at the referer field.

What exactly is that telling us?
- The referer field shows where they were referred from.
- The referer is, in essence, telling you what site was "linking" to the new site.
- Warning! The referer can act in misleading ways.

Referer Field
- The referer field is the address of the page that links to the new GET request.
- However, this link could have been followed automatically, without any action by the user. In the case of the current weather image, for example, the link was automatic and the user wasn't even aware of the action.
- The referer field could also indicate a user action. For example, imagine we were on the NSA Today webpage and clicked the link to the SID Today page. What would that traffic look like?

Referer Field
    Host: sidtoday.nsa
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
    Accept: ...
    Accept-Language: ...
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1
    Keep-Alive: ...
    Connection: keep-alive
    Referer: http://home.www.nsa/
    Cookie: ...
- Now we're seeing a request go to host "sidtoday.nsa" with the referer from http://home.www.nsa/.
- How can we tell from the traffic that the first automatic referer we saw for the current weather was any different from the user-generated referer we saw for the SID Today article?

Cookies!

Cookies
- Cookies are small pieces of text-based data stored on your machine by your web browser.
- Almost all websites have cookies enabled and they have a variety of uses, including to help the website track the activities of its users.
- Most are probably familiar with "machine specific cookies" like the Yahoo cookie.
- However, cookies are used for a variety of reasons.

What can cookies be used for?
- Cookies can be used to authenticate a user. For example, in many cases the "active user" for Yahoo web-mail traffic is seen encoded in the l= part of the cookie string.
(Screenshot: part of a Yahoo cookie string.)
- Cookies can be used to store information about the user that the website is interested in. Look at how the p= value tells the website information about the user of this account.
(Screenshot: part of a Yahoo cookie string and the account details it carries.)
- Cookies can be used to identify a single machine from hundreds of other users on the same proxy IP address. The Yahoo cookie is a "machine specific cookie".
- Important note: All three of those examples are just subsets of the full Yahoo cookie string.

How do we know what each cookie value is used for?
- Nearly every website uses cookies that in most cases it designed for its own uses, so how do we know what they all mean?
- Protocol Exploitation can examine the traffic to try to determine if there is any information contained in cookie strings that we might be interested in. For example, we'd like to know if any part of the cookie acts like a "machine specific cookie".
- However, there are far more cookie options out in the wild than PE can possibly examine. Even if they aren't aware of a machine specific cookie, it doesn't mean that it doesn't exist.
- X-KEYSCORE gives you access to the full cookie string, so if you're adventurous enough you can do your own protocol exploitation.

Remember: Cookies are there for a reason!
Websites put cookies on people's computers for a reason. If the data is valuable for a website, it may be valuable to us as well.

How long do cookies live for?
- Cookies, like any other file on a computer, can be deleted by the user.
- Almost all browsers give you the option to view, manage and delete your cookies.

Cookies
You can see what cookies have been stored on your machine by going into the "options" window of your browser and selecting "show cookies".
(Screenshot: the browser's options window.)

Searches

Searching the Internet
When a user searches the Internet from one of the many web-based search engines (Google, Bing, etc.), what does the traffic look like?

Searching the Internet: Client-to-Server
In most cases, the client-to-server traffic is a GET request where the search term is passed in the URL Arguments:
    GET /search?...q=iran...
    Host: www.google.com
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, ...
    Cookie: ...
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.1)
    Connection: Keep-Alive
    Cache-Control: no-cache
- Notice how the URL Path is /search and one part of the URL argument is q=iran.
- Each website can configure its search form differently, so while with Google the search term is contained in the q= part of the URL, a different search form might have it as query= or search_term= etc.
- X-KEYSCORE tries to account for all the variations of search terms contained in the URL Argument for what it extracts for the "Search Term" column.
- However, there are always other varieties out there that we haven't built in hooks for yet, so anytime you see something that you think should be extracted, please contact the team.

"Referer Searches"
What happens when a user clicks on a search result? Let's start by showing the query itself. In this example, we're going to run a query on Google.
What does that GET request look like?
    GET ...
    Host: google4.q.nsa
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
    Accept-Language: ...
    Accept-Encoding: gzip,deflate
    Accept-Charset: ...
    Keep-Alive: 300
    Connection: keep-alive
We know from this session that the client is requesting the data from the host "google4.q.nsa" and we see the search term in the URL Argument.

"Referer Searches"
What happens when a user clicks on a search result?
    GET ...
    Host: ...
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
    Accept: ...
    Accept-Language: ...
    Accept-Encoding: gzip,deflate
    Accept-Charset: ...
    Keep-Alive: ...
    Connection: keep-alive
    Cookie: ...
    Referer: ...
- First, we can determine the full URL by adding the GET line to the host.
- Secondly, we get some hints as to why the user was requesting that page from the Referer line. Note that it was the same URL that we were at immediately before we clicked the "result" link.

"Referer Searches"
Let's look at that process again:
1. First, a client-to-server request is sent to google4.q.nsa that contains the query.
2. Second, the server responds with the search results.
3. Third, by clicking on one of the results, a new GET request is issued to retrieve the home page. In this request, the location of the original search is listed as the "referer".
(Diagram: the client, google4.q.nsa and the server hosting the clicked result.)
What will happen if we only have collection on this link?

"Referer Searches"
When XKEYSCORE sees a search contained in the "referer" field, we still extract it out as meta-data into the "search terms", but we append it with (referer) to denote where it was originally found.
(Screenshot: a Search Terms value of "(referer) the legal status of the caspian sea" alongside the URL Path and Referer fields.)

"Referer Searches"
    GET /...caspian_status.html
    Accept: ...
    Host: ...
    Referer: ...legal+status+of+the+caspian+sea
    Accept-Language: fa
    Accept-Encoding: gzip, deflate
    User-Agent: ... (compatible; MSIE ...; Windows NT 5.1; SV1; .NET CLR ...)
    Cache-Control: max-stale=0
    Connection: close
    X-BlueCoat-Via: ...
Can we guess what happened here?

Referer searches
Another example:
(Screenshot: a captured GET request with its header lines.)

Proxy Information
In a lot of cases we're going to see HTTP Activity from behind a proxy or proxies.
What is a proxy?
- A proxy is a server that is acting as an intermediary for HTTP requests from clients.
Why do proxies exist?
- Performance: Proxy can cache responses for static pages
- Censorship: Proxy can filter traffic
- Security: Proxy can look for malware
- Access-Control: Proxy can control access to restricted content

Proxy Information
Routinely, we're going to see ISP level proxies. That is, instead of having each individual user request web pages directly from the web servers, the ISP is going to collect all of those requests first, and then proxy them out through a handful of proxy IP addresses. When the response is returned, the proxy passes it on to the appropriate user.

Proxy Information
Why would the ISP want to proxy traffic?
- In many cases the ISP won't have to supply public IP addresses to all its users.
- It can simply give them a private IP address, and then use a handful of public IP addresses for its proxies, which are the machines actually requesting the traffic from the web servers.

Proxies on the Internet
(Diagram: single-user connections to web servers alongside multiple users multiplexed through proxies, with short-lived and long-lived connections.)
Identifying a Proxy
How do you know that the IP address that you think is your target is really a proxy?
- First step, check NKB. They have services that attempt* to automatically detect proxies.
* These services are in no way 100% accurate, so this is only the first step in checking to see if the IP Address is a proxy.

Identifying a Proxy: NKB
(Screenshot: NKB query results for an IP address.)

Identifying a Proxy
Other things to be on the lookout for: X-Forwarded-For IP Address
- What is it? An X-Forwarded-For IP address is the proxy passing on to the server what it thinks is the IP address of the user.
- Think of it as the proxy telling the server "this is who I think this request came from".
- It's important to note that multiple proxies can be, and often are, present, so one proxy might just be reporting the IP address of another proxy.

Identifying a Proxy
X-Forwarded-For IP Address as seen in traffic:
(Screenshot: a GET request carrying an X-Forwarded-For line.)
Some examples of X-Forwarded-For headers:
(Screenshot: a list of X-Forwarded-For header values, one of them showing multiple layers of proxies.)
- In general, the first IP is the one closest to the original requester.
- Keep in mind these can be totally fake.

Identifying a Proxy
Similar to the X-Forwarded-For tag is the Via tag. The Via tag is the proxy identifying itself.
(Screenshot: a GET request carrying a Via line.)

Identifying a Proxy
The Via: tag may even contain some good information about the proxy. Be careful though, because this information could be falsified.
(Screenshot: example Via values.)
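The X-Forwarded-For handling described above can be sketched roughly as follows: split the header value on commas, keep the entries that parse as IP addresses, and treat the first one as the hop closest to the original requester. The `parse_xff` helper and the sample value (documentation-range addresses) are hypothetical, and because the header can be faked, the result is only a hint.

```python
import ipaddress


def parse_xff(value):
    """Return the valid IP addresses in an X-Forwarded-For value, in order."""
    hops = []
    for part in value.split(","):
        token = part.strip()
        try:
            hops.append(ipaddress.ip_address(token))
        except ValueError:
            # The header is free text set by whoever sent it, so skip junk.
            continue
    return hops


# Hypothetical header value using documentation-range addresses.
hops = parse_xff("198.51.100.7, 192.0.2.44, unknown")
closest_to_requester = hops[0] if hops else None
print([str(h) for h in hops], closest_to_requester)
```

Entries that are not addresses are dropped rather than raising an error, since real headers often contain placeholders or garbage.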
Identifying a Proxy
Remember though that the X-Forwarded-For and Via lines can be falsified and don't have to be present! If they're not present, how can you tell the IP address is a proxy? Test it in MARINA.

Testing IP Addresses in MARINA
- The primary side effect of a proxy is too many users online at the same time.
- So if all else fails, try querying on the IP address (assuming it's compliant, of course!) in MARINA to see how many users were active within an hour time frame.
- It's not scientific, but generally it will help.

Testing IP Addresses in MARINA
For example, look at these results:
(Screenshot: MARINA results for one IP address over one hour.)
There were 274 unique "Active Users" in that hour. Think it's a proxy?

HTTP Header Fingerprint (HHFP)

What is the HHFP?
- GCHQ created the HHFP to help identify individual users behind a single proxy IP address.
- The HHFP is a hash of multiple header fields that can be used to identify a single user behind a proxy.

What is the HHFP?
At least one of these values must be present:
- X-Forwarded-For IP Address
- Via
- Client IP address
If so, the HHFP is a hash of those values combined with the User Agent string.
(Screenshot: query results listing HHFP values.)
Here's an Iranian proxy IP Address that has multiple HHFPs underneath it.
NOTE: There's no guarantee that an HHFP is identifying a single unique user; it's entirely possible that more than one user will have the same HHFP.
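To make the idea of a header fingerprint concrete, here is a minimal sketch of hashing the proxy-related fields together with the User-Agent string. The source does not say which hash function, field order or encoding the HHFP uses, so SHA-256, the "|" separator, the 8-character truncation and the `header_fingerprint` name are all assumptions, and the sample values are invented.

```python
import hashlib


def header_fingerprint(user_agent, xff=None, via=None, client_ip=None):
    """Hash the proxy-related fields together with the User-Agent string.

    Returns None when none of the three proxy-related fields is present.
    """
    if not (xff or via or client_ip):
        return None
    # Field order, separator, SHA-256 and the 8-character cut are assumptions.
    material = "|".join([xff or "", via or "", client_ip or "", user_agent])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:8]


# Hypothetical header values.
fp = header_fingerprint(
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)",
    xff="198.51.100.7",
    via="1.1 proxy.example",
)
print(fp)
```

Two users whose headers are identical produce the same value, which is the collision case the NOTE above warns about.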
Pros and Cons of HHFP
- On the positive side, the HHFP is a single 8 digit value which can help identify a single user behind a proxy.
- On the negative side, it requires an XFF IP address, Via string or Client IP Address, and since many sessions contain none of the three, they'll have no HHFP string.
- Also, even with the HHFP, all of the fields that are used to build it are available in the XKS HTTP Activity query, so it's not providing you with any data you don't already have access to.

HTTP Activity Search

XKS HTTP Activity Search
After that overview of how HTTP Activity works, let's look into how to effectively target it through XKS queries.

XKS HTTP Activity Search
- HTTP Activity indexes every HTTP session, both client-to-server and server-to-client.
- It can be queried on any of the unique HTTP meta-data fields or any of the "standard" DNI fields (IP Address, SIGAD, CASENOTATION etc.).

XKS HTTP Activity Search
Unique meta-data fields of this search include:
(Screenshot: the search form's HTTP-specific fields, such as URL Path, Search Terms, Cookie, Attachment Filename and Character Encoding.)
In addition to all of the common fields like:
(Screenshot: the search form's common fields, such as Application, IP Address, Country and City.)

XKS HTTP Activity Search
- Most commonly, HTTP Activity query searches in XKS will be to enable "persona analysis".
- Based on MARINA, TRAFFICTHIEF or PINWALE, we'll want to query XKS to discover all of the HTTP Activity that occurred around the target's session of interest.

Simple HTTP Searches
In order to do a "persona analysis" type search, all we'll need to fill in is the IP of the target (assuming it's compliant) and a short time range "around" the time of the activity:
(Screenshot: the search form's datetime fields.)
XKS HTTP Activity Search
Another common query is to see all traffic from a given IP address (or IP addresses) to a specific website.
- For example, let's say we want to see all traffic from IP Address 1.2.3.4 to a particular website.
- While we can just put the IP address and the "host" into the search form, remember what we saw before about the various host names for a given website.

Host Field
It's important to note that in many cases users think they're at a single website, but behind the scenes data is coming from a number of different servers without the user knowing it.
(Screenshot: a list of captured sessions.)

XKS HTTP Activity Search
- In order to account for all of the possible host names, we must front-wildcard the host name.
- Be careful when front-wildcarding: beyond being resource intensive for XKS, it can be dangerous.

Hints for wildcarding a host name
- If you're trying to query for traffic to a website, the best way to wildcard it is: *.website.com
- Notice that the . before the hostname website is still there. That way we will properly hit on ads.website.com and images.website.com but avoid false hits on unrelated host names that merely end in website.com.

Hints for wildcarding a host name
Why are we only interested in traffic coming from our IP of interest going to our website of interest?

Helpful GUI Shortcuts
Earlier we talked about how XKS broke a GET request into the URL Path and URL Argument (separated by a ?). For example, a forum URL gets broken out to:
(Screenshot: the URL's Host, URL Path and URL Args values in separate fields.)
Helpful GUI Shortcuts
So if we were to query for this URL we would need to enter those fields in separately:
(Screenshot: the Host, URL Path and URL Args search fields filled in.)

Helpful GUI Shortcuts
Or we could use the "Field Builder" to simply copy and paste the full URL and let XKS break it into its appropriate parts:
(Screenshot: the Field Builder dialog, "Enter a URL that will be automatically parsed to populate the host, path, and argument fields".)
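The kind of break-out the Field Builder performs can be approximated with Python's standard URL parser. This is only a sketch of the general technique, not a description of how XKS implements it; the `split_url` helper and the forum URL are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl


def split_url(url):
    """Break a URL into host, path and argument parts."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "path": parts.path,
        "args": parts.query,  # everything after the "?" (before any "#")
    }


# Hypothetical URL; forum.example.com is not from the source.
fields = split_url("http://forum.example.com/showthread.php?t=12345&page=2")
print(fields)
# Individual arguments are separated by "&".
print(parse_qsl(fields["args"]))
```

Keeping the argument string whole and parsing it separately mirrors the distinction drawn earlier between the URL Argument and the individual "arguments" inside it.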