Is it possible to monitor and redirect HTTP requests system-wide with Go and its libraries?

agolangf · 708 views
<p>INB4 browser plugins: I am a webdev and could make such a thing, but I want to tackle this issue at the system level so I don't have to test, manage, and package for 5 different browsers.</p>
<p>I'm trying to make a system-wide whitelist app for my OS X machine (and hopefully also Android) which matches browser traffic against regexes and optionally redirects it. Somebody mentioned using Go. Can someone offer me any examples of monitoring and redirecting HTTP requests, in Go or in general? Thank you :D</p>
<hr/>**Comments:**<br/><br/>
TwilightTwinkie: <pre><p>This is basically a proxy. You should check out the httputil package. It may not do exactly what you are after, and you may need to write a lot of custom code. It's a good place to start, though.</p> <p><a href="https://golang.org/pkg/net/http/httputil/" rel="nofollow">https://golang.org/pkg/net/http/httputil/</a></p></pre>
Nitrodist: <pre><p>There's a tool called mitmproxy. This is exactly its use case. The tricky part is SSL, but you would have that problem with Go anyway.</p> <p>If you're interested in more, I can paste you a snippet that does kind of what you want.</p></pre>
basiclaser: <pre><p>Yes please!</p></pre>
Nitrodist: <pre><p>Here's a simple script to rewrite <a href="/r/soccer" rel="nofollow">/r/soccer</a> to <a href="/r/nfl" rel="nofollow">/r/nfl</a>. Let me know if you have any questions:</p> <pre><code># This file is called 'rewrite_req.py'
# GUI (ncurses) version: mitmproxy -s rewrite_req.py -p 8080
# Non-visual version: mitmdump -s rewrite_req.py -p 8080
#
# Note: this targets the older mitmproxy scripting API used in the thread.
# Newer releases use request(flow) and the flow.request.url attribute instead.
import re


def request(context, flow):
    # Need to debug? Run this file with mitmdump instead of mitmproxy and
    # use the line below to drop to a debugger
    # import pdb; pdb.set_trace()

    # e.g. 'http://reddit.com/r/soccer'
    request_url = flow.request.get_url()
    soccer_regex = r'soccer'
    if re.search(soccer_regex, request_url):
        # replace /r/soccer with /r/nfl
        new_request_url = re.sub(soccer_regex, 'nfl', request_url)
        flow.request.set_url(new_request_url)
</code></pre></pre>
p4r14h: <pre><p>Obviously the language doesn't matter. For the system level, you want to use a proxy (<a href="https://support.apple.com/kb/PH18553" rel="nofollow">OS X</a>, <a href="http://stackoverflow.com/questions/21068905/how-to-change-proxy-settings-in-android-especially-in-chrome" rel="nofollow">Android</a>) to send traffic to your server (where you can then run something like <a href="https://github.com/elazarl/goproxy" rel="nofollow">https://github.com/elazarl/goproxy</a>) and redirect traffic based on whatever logic you want. Not sure about SSL, but it'll probably be complicated.</p> <p>Edit: Also, check out <a href="https://code.google.com/p/middler/" rel="nofollow">middler</a>, as it might be of some interest if you don't want to host a proxy. Also, if you're looking to get this done for non-academic reasons, I can write a custom proxy for you in about a day, for money of course.</p> <p>Edit 2: Even Middler uses a proxy: <a href="https://code.google.com/p/middler/source/browse/trunk/libmiddler/proxies/http/http_proxy.py" rel="nofollow">https://code.google.com/p/middler/source/browse/trunk/libmiddler/proxies/http/http_proxy.py</a></p></pre>
basiclaser: <pre><p>Thanks, the obviousness was unobvious as I usually never leave a browser. Can you think of any subtler benefit to using Go, or another language?</p></pre>
p4r14h: <pre><p>I wasn't being flippant, by the way. Regardless, Go, Python, and Node will all work as simple HTTP servers. The benefits of Go are really purely subjective.
It's a compiled language with a strong core, a sane dependency management system, and lower-level abstractions. Plus, GC.</p> <p>Without modifying the packets on the wire (which isn't easily done with any type of syscall; see the middler link), you're really just using the supported proxy settings for whatever platform you want to target. If you do want to modify another process's memory, then you'll basically need to figure out which OS X process corresponds to the IP stack, find out where in memory it buffers network packets, and then use GDB or vm_read/vm_write to modify it.</p> <p>Try these references: <a href="https://books.google.com/books?id=KjX57QfgwZkC" rel="nofollow">https://books.google.com/books?id=KjX57QfgwZkC</a>, <a href="http://stackoverflow.com/questions/10668/reading-other-process-memory-in-os-x" rel="nofollow">http://stackoverflow.com/questions/10668/reading-other-process-memory-in-os-x</a></p></pre>
sethammons: <pre><p>You could look at this, which I wrote as a simple HTTP debugging proxy. It can hijack requests and send back specific content. <a href="http://www.github.com/sethgrid/fakettp" rel="nofollow">www.github.com/sethgrid/fakettp</a></p></pre>
TheMerovius: <pre><p>Due to HSTS, certificate pinning, and the like, it'll be pretty much impossible to use HTTPS with something like this (and, hopefully soon, pretty much all web traffic will be HTTPS). This isn't really solvable unless you do the interception at the endpoint of the connection, i.e. in the browser.</p></pre>
basiclaser: <pre><p>Thanks for the critical look at the idea. To rephrase my requirements, I want to 'kill' requests to certain URLs; I don't necessarily need to redirect. So could I do this without a proxy (thus avoiding HTTPS impossibilities)?
How do you feel about p4r14h's solution of modifying the IP stack in memory?</p></pre>
TheMerovius: <pre><blockquote> <p>To rephrase my requirements, I want to 'kill' requests to certain URLs; I don't necessarily need to redirect. So could I do this without a proxy (thus avoiding HTTPS impossibilities)?</p> </blockquote> <p>It depends a bit on the granularity of "URL".</p> <ul> <li>You could block certain hostnames (so "everything on youporn.com", for example) by overwriting or changing DNS entries.</li> <li>You could block certain IPs (so "everything that is hosted on that server") by using firewall rules. The effects of this are harder to predict, because a) the same page could also be hosted on a different IP and b) other pages might be hosted on the same IP.</li> <li>If you want more granularity (e.g. "block this particular reddit post"), you need to inspect the HTTP headers, and you are out of luck with regard to HTTPS.</li> </ul> <p>Basically, what you want to do is what countries like the UK are doing, and conceptually it can't really work without unwanted side effects. To understand this, you need some basic understanding of how the web works. When you request a webpage in your browser:</p> <ol> <li>Your browser looks up the domain name in the DNS to find the IP address that is serving the page.</li> <li>It connects to that IP address on TCP port 80 or 443 (HTTP or HTTPS respectively).</li> <li>If you are using HTTPS, it sets up an encrypted tunnel over the TCP connection and uses that.</li> <li>It sends an HTTP request with the full URL of the resource it wants over the connection and gets a response.</li> </ol> <p>The problem is that only at step 4 do you know the specific resource requested, and for that, step 3 needs to succeed. In the case of HTTPS, you can't inspect the contents from outside the endpoint.
So you either have to act before step 3, without knowing the specific resource (which means you probably block too much), or you can't block HTTPS requests.</p> <blockquote> <p>How do you feel about p4r14h's solution of modifying the IP stack in memory?</p> </blockquote> <p>The IP stack isn't the point you need (the same argument as above applies: IP is the layer at step 2), and you can do that more simply with firewall rules. What you might want to inspect and manipulate is the memory space of the browser (because that's the HTTPS endpoint), and that's just crazy: not only is it even more browser-specific than an extension, it's also far more complicated.</p> <p>So there you have it. If you also want to be able to manipulate HTTPS requests, for example because the granularity of hostnames or hosts isn't specific enough, you definitely need code running inside the browser process, which means you want a plugin or extension. I don't see any way around that.</p></pre>
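
<p>To make the proxy approach discussed above concrete, here is a minimal Go sketch that uses only the standard library (net/http and the httputil package linked earlier). It is not anyone's code from the thread: the rule list, the placeholder host ads.example.com, the listen address 127.0.0.1:8080, and the names rule, decide, forward, and handler are all assumptions made for illustration. It only filters plain HTTP. For HTTPS the proxy sees just a CONNECT request with a host and port, which is the granularity limit TheMerovius describes, so the sketch simply refuses tunnels.</p>

```go
// A sketch of a filtering forward proxy; rules and names are illustrative assumptions.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"regexp"
)

// rule maps a URL pattern to a replacement; an empty replacement means "block".
type rule struct {
	pattern     *regexp.Regexp
	replacement string
}

// Hypothetical rules, mirroring the /r/soccer to /r/nfl example from the thread.
var rules = []rule{
	{regexp.MustCompile(`/r/soccer`), "/r/nfl"},
	{regexp.MustCompile(`^http://ads\.example\.com/`), ""},
}

// decide returns the (possibly rewritten) URL and whether the request is blocked.
// The first matching rule wins.
func decide(rawURL string) (target string, blocked bool) {
	for _, r := range rules {
		if !r.pattern.MatchString(rawURL) {
			continue
		}
		if r.replacement == "" {
			return "", true
		}
		return r.pattern.ReplaceAllString(rawURL, r.replacement), false
	}
	return rawURL, false
}

// forward relays proxy requests as-is; they already carry an absolute URL,
// so the Director has nothing to rewrite.
var forward = &httputil.ReverseProxy{
	Director: func(req *http.Request) {},
}

func handler(w http.ResponseWriter, req *http.Request) {
	if req.Method == http.MethodConnect {
		// HTTPS tunnel: only host:port is visible here, never the path.
		http.Error(w, "CONNECT is not supported in this sketch", http.StatusNotImplemented)
		return
	}
	if !req.URL.IsAbs() {
		// A direct (non-proxy) request; there is nothing to forward it to.
		http.Error(w, "this server only handles proxy requests", http.StatusBadRequest)
		return
	}
	original := req.URL.String()
	target, blocked := decide(original)
	switch {
	case blocked:
		http.Error(w, "blocked by local policy", http.StatusForbidden)
	case target != original:
		http.Redirect(w, req, target, http.StatusFound)
	default:
		forward.ServeHTTP(w, req)
	}
}

func main() {
	log.Fatal(http.ListenAndServe("127.0.0.1:8080", http.HandlerFunc(handler)))
}
```

<p>To try it, point the HTTP proxy setting of the OS (see the OS X and Android links in p4r14h's comment) at 127.0.0.1:8080. Keeping the matching logic in decide, separate from the handler, is a design choice so that the rules can be tested without any network traffic.</p>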
