Dynamic Generation of Golang from SQL Stored Procedures

blov · · 705 次点击    
这是一个分享于 的资源,其中的信息可能已经有所发展或是发生改变。
<p>Hi everyone, I should preface this with the fact that I am brand new to programming and Golang but have a lot of experience with SQL. High level, we have a system that uses thousands of SQL stored procedures (Oracle aka PL/SQL) to generate &#34;filters&#34; used to run a front end. So process looks like this:</p> <p> </p> <p>1) Millions of rows of data are staged into Oracle Server.<br/> 2) Stored procedures are run across staged data, filtering it.<br/> 3) Results are imported into individual tables (one for each &#34;filter&#34;)</p> <p> </p> <p>The problem is this process takes hours and has become a crux for the organization (currently paying a DBA major money to re-write the most taxing stored procs). Our company has recently decided that GoLang will be the new/core development language for IT. All existing jobs (mostly COBAL) will be rewritten and all new jobs will be written in GoLang.<br/>  </p> <p>So here is my idea (which I&#39;m not sure is a good one, but thought it might be a first project to start learning on)... The goal would be to take the WHERE clause of the PL/SQL script and run it through a GO script that turns it into a GO function... For Example,  </p> <p>WHERE Column_X IN (&#39;A&#39;,&#39;B&#39;) AND Column_Y = &#39;C&#39;<br/>  </p> <p>would turn into something like:<br/> Func Filter1 (COLUMN_X string) bool {  </p> <pre><code>switch COLUMN_X { case &#34;A&#34;, &#34;B&#34;: </code></pre> <p>  return true</p> <p>  }  <br/> return false   }   </p> <p>Obviously that leaves out the AND Column_Y = &#39;C&#39; part (I need to learn how to do that), but do you see what I am trying to do? Basically I want to build a script that can Intake a SQL WHERE statement, iterate through it and find all of the filtering criteria, and then dynamically turn that into GoLang Functions. If the script did it correctly, once completely, our 1,000s of stored procedures could be automatically turned into executable GoLang scripts without having to rewrite any of the stored procedures one at a time. </p> <p>  </p> <p>Is this a dumb/bad idea? Goal would be that once the millions of data rows are staged in Oracle, you quickly run them through the Golang script which has had each of the filters converted into Go functions. Each function returns a &#39;Yes&#39; this item is on this filter, or no this item is excluded from this filter. We can use those YES/NO values to partition the data. So instead of </p> <p>  WHERE Column_X IN (&#39;A&#39;,&#39;B&#39;) AND Column_Y = &#39;C&#39; </p> <p>  You have </p> <p>  WHERE PARTITION_1 = &#39;YES&#39;  </p> <p>This would allow us to process all of the &#34;filters&#34; via servers running GoLang instead of directly on the Oracle Server. The Oracle Server would only run and fetch partitioned data, and not have to perform the Stored Procedure filtering and staging itself. Just curious if anyone has tried to accomplish something like that before, or if it is a really bad idea. Thanks in advance!! (also I am taking my first formal GoLang course soon so sorry for the terrible pseudo code above). </p> <hr/>**评论:**<br/><br/>djherbis: <pre><p>I actually had to do something somewhat similar for Java, essentially:</p> <ul> <li>parse an SQL-like language into an expression tree</li> <li>build and compose predicate objects from the expression tree recursively (or alternatively, generate code that evaluates the predicates)</li> <li>map column names to values of row objects, apply predicates to the mapped row data</li> </ul> <p>If you <a href="https://www.google.com/search?q=golang+sql+parser&amp;oq=golang+sql+parser&amp;aqs=chrome.0.0l4j69i64.1647j0j7&amp;sourceid=chrome&amp;ie=UTF-8" rel="nofollow">Google for sql parsers</a> in Go there looks like the first parse might already be done for you.</p> <p>That all being said, it&#39;s still probably quite challenging for someone new to Go, and more importantly, what leads you to believe that Go performing the filtering would be faster than Oracle doing so? Databases are usually impressively fast at filtering data (especially when indexed correctly). I would think that if it takes hours to process, the solution isn&#39;t just to rewrite it in Go, it would be to investigate where the bottlenecks are and solve those, that&#39;s hopefully what the DBA would be doing. Switching languages can sometimes lead to performance boosts, but good systems design is more important when talking about scalability.</p></pre>PaluMacil: <pre><p>Since the most expensive part of this calculation is the communication between the database and application server, the fact that Go is a fast language might not help you save time. Oracle is very good at set wise calculations. You can certainly write very slow SQL queries by splitting them up into intermediate tables and not understanding the query plan, ignoring indexes, etc, but SQL is a declarative language where you describe an end result and get the full power of many many Oracle developers who have already optimized a way for you to get the results you described through constraints indexes and all sorts of clever caching. The way you are looking to do it would be imperative in nature, and while the code would do exactly what you demanded, there could be clever ways of optimizing it that a datastore such as Oracle would have been able to provide transparently.</p></pre>singron: <pre><p>This is possible, but if you haven&#39;t written a compiler before it might be difficult. Also it&#39;s dubious whether it will be faster. In general, IO bound tasks like this almost always run faster inside the database rather than streaming out of the DB, processing somewhere else, and streaming back in. Also, introducing more columns to mark the partitions is going to make your tables even bigger.</p> <p>I also find SQL very productive and find that performing significant logic outside of the database to not be any easier.</p> <ol> <li>You have to know SQL anyway to query the DB.</li> <li>It&#39;s a pain to have performant and reusable code. E.g. if you represent a row with a struct, your functions might assume you selected every field from the DB, so you tend to have to <code>SELECT *</code> instead of just using the columns you need. Similarly, if you need to join tables, it can be difficult to make sure your query and go functions agree on what joins are made.</li> <li>It&#39;s the same logic, but now in 2 languages instead of 1 with all the verbosity needed to glue them together.</li> <li>The optimizers of each language don&#39;t know about the other language.</li> </ol> <p>I would take a look at optimizing your SQL. It sounds like <code>INSERT ALL</code> (i.e. multi table inserts) would be very useful for this if you aren&#39;t already doing it by letting you scan your main table only 1 time. Are you doing joins? How many GB is your database? Is your database otherwise very loaded or using underpowered hardware? Unless you are doing something really strange or your DB is TBs, i don&#39;t see why it would take so long. If you don&#39;t know why it&#39;s this slow, then you don&#39;t know if rewriting it in go can make it faster, and it&#39;s almost definitely going to make it slower.</p> <p>Now if you are really set on doing this this is an outline of how i would do it.</p> <ol> <li>Parse SQL into AST. (I think there are go libraries for this)</li> <li>Analyze types of columns and expressions. You will need the table schema.</li> <li>Translate SQL AST to go AST. <ol> <li>Most AST nodes have simple translations like the one you have shown. Once you know the types, you can probably do this in 1 pass as a depth first traversal.</li> <li>Remember gotchas of SQL logic like comparisons to NULL.</li> <li>It&#39;s easiest if you make the whole WHERE clause a single function that takes all the columns it needs as arguments (this list is different than the SELECT clause and you will have to fetch the union of the columns to cover both).</li> </ol></li> <li>Generate go code from go AST.</li> </ol></pre>

入群交流(和以上内容无关):加入Go大咖交流群,或添加微信:liuxiaoyan-s 备注:入群;或加QQ群:692541889

705 次点击  
加入收藏 微博
0 回复
暂无回复
添加一条新回复 (您需要 登录 后才能回复 没有账号 ?)
  • 请尽量让自己的回复能够对别人有帮助
  • 支持 Markdown 格式, **粗体**、~~删除线~~、`单行代码`
  • 支持 @ 本站用户;支持表情(输入 : 提示),见 Emoji cheat sheet
  • 图片支持拖拽、截图粘贴等方式上传