Goodbye to cumbersome Excel: mastering data analysis and processing technology depends on this
2022-07-31 05:25:00 【m0_67402013】
In the era of big data and artificial intelligence, data analysis is a hot topic, and data analyst looks like a popular career. Many laymen want to learn it, but faced with a vast and complicated range of technologies, they don't know where to start. So people often ask exactly that question. Sometimes they ask something more specific, such as how to learn SQL or whether they should learn Python, but behind these questions there is usually the same concern.
I deliberately attach the word "processing" here, because data analysis does not exist on its own. The data used for analysis often has to be prepared first, and all of that falls under data processing. Take merging a pile of Excel files: the analysis may be the next task, but this step is unavoidable. There are also chores such as generating employee cards from a roster, which clearly cannot reasonably be called data analysis, yet they are part of daily work too. This post covers all of these data-related tasks together.
Also, this is a topic for laymen. Once you reach professional level there is not much more I can say here: the technologies involved get more and more complex, but anyone who can get that far will know what to do next and won't need to read a nagging post like this one.
I also want to stress that data analysis is not just a technical problem; you could even say the main problem is not technical at all. Many people think that picking up some of the relevant skills is enough to do data analysis, and that is completely wrong. The key to data analysis is business knowledge, that is, knowing what needs to be analyzed. People with rich business experience using very primitive techniques can often produce more meaningful results than people with no business sense using advanced techniques. It is like transport: driving various vehicles is a skill, but knowing where the goods will fetch a good price matters more. Someone who hauls goods by handcart, as long as they go to the right place, may well earn more than someone using a truck. Don't put blind faith in data analysis techniques and tools. If you don't know the business purpose of the analysis, learning techniques and tools is of no use. Moreover, comparable techniques and tools usually differ very little with respect to the goal of data analysis (for getting one person around, a Mercedes-Benz and a BMW do much the same job), so agonizing over the choice makes little sense.
Back to the subject. This article is mainly about technology; after all, mastering the technology does greatly improve work efficiency.
Let's start from the most basic:
You read that right. Excel, which everyone can use, is the most basic data analysis and processing technology. Precisely because almost everyone can use it, this level is called level 0 and is not counted in the numbering.
I won't go into what Excel can do and how to do it here; there is plenty of material online. Besides, laymen who care about data technology will already know at least some Excel. For most people this level is already behind them, but it still has to be listed. If you feel your Excel is not solid enough (for example, you can't use VLOOKUP or write an IF), go and make up for that first. This is the foundation of the foundation; even data analysts working with big data cannot do without it.
Agile BI helps us do what is called multidimensional analysis, that is, looking at statistics by all kinds of classifications: sales by month, production by region, or several of these at once. When an aggregate total catches your eye, you can drill down to see what is going on. For example, if sales in Beijing are good, you can check whether people in Haidian are buying with particular enthusiasm. These statistics can also be drawn as various charts, which are more vivid and make it easier to spot problems or patterns. This is data analysis beyond any doubt. Turn the results into a report for the boss, and promotions and pay rises all depend on it.
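To make "statistics by classification" and "drilling down" concrete, here is a minimal sketch in plain Python. The order records, the field layout and the helper name total_by are all made up for illustration; an agile BI tool does the same grouping behind a drag-and-drop interface.

```python
from collections import defaultdict

# Hypothetical order records: (month, city, district, amount)
orders = [
    ("2022-01", "Beijing", "Haidian", 500),
    ("2022-01", "Beijing", "Chaoyang", 200),
    ("2022-01", "Shanghai", "Pudong", 300),
    ("2022-02", "Beijing", "Haidian", 700),
    ("2022-02", "Shanghai", "Pudong", 100),
]

def total_by(records, key_index, where=None):
    """Sum the amount column, grouped by the field at key_index.

    `where` is an optional filter used to drill into one slice of the data.
    """
    totals = defaultdict(int)
    for row in records:
        if where is None or where(row):
            totals[row[key_index]] += row[3]
    return dict(totals)

# Two "dimensions": sales by month and sales by city
by_month = total_by(orders, 0)
by_city = total_by(orders, 1)

# Drill down: Beijing looks strong, so break it up by district
beijing_by_district = total_by(orders, 2, where=lambda r: r[1] == "Beijing")

print(by_month)
print(by_city)
print(beijing_by_district)
```

The point of the sketch is only that each view is the same sum over a different grouping key, and that drilling down is one more filter plus a finer key.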
The representative agile BI products are mainly foreign, such as PowerBI and Tableau. Excel itself also has pivot tables, which can do basic multidimensional analysis but are not good enough in the details; this feature can also be counted as a level 1 technology.
Many domestic reporting tool and BI vendors also shout about agile BI; frankly, however, none of them really qualifies. To be honest, domestic BI products (reporting tools can be included here too) are quite good in terms of technical level (from a certain angle their technical content is even higher than that of foreign products), but they are all heavy and not agile. The reasons are rather complicated and beginners would not understand them for now. In any case, if you see results that a non-professional supposedly produced on their own with some domestic BI product, you can assume it is a case staged by the vendor. Just remember: there is no simple, easy-to-use domestic agile BI product aimed at non-technical people, and most likely there won't be in the future either.
As mentioned earlier, we also run into tedious chores like generating employee cards from a roster. Agile BI can't help you there, but Runqian Report (润乾报表) is good at it.
Isn't Runqian Report one of those domestic BI products? Doesn't that contradict what was just said?
Not at all. Runqian Report is indeed a domestic reporting tool (which also counts as BI), but among level 1 technologies it does not appear as a BI technology. Like other domestic products, it has no agile BI capability; here it is a tool for making reports.
Making reports in Excel is no problem, but only static ones. If the rows and columns of the table change with the amount of data, you can't do it. If there are multiple layers of grouping and the data changes every month so the whole thing has to be redone, that is really exhausting, and the slightest carelessness leads to mistakes. Does a report like this count as data analysis? It doesn't matter; it has to be done in daily work anyway, and agile BI tools can't handle it.
For Runqian Report these things are easy: make a template, and when the data changes it is done with one click. Dynamic reports with multi-level rows and columns are no trouble at all. Chinese-style reporting technology is the best in the world, and that is no idle boast.
Why only Runqian Report? Can't other domestic reporting tools do this?
Because only Runqian Report offers usage aimed at non-professionals in the workplace, together with related course materials. Other reporting tools can indeed do complex reports, but they are aimed at professional technical staff (maybe one day some of them will also enter the workplace market), so they are not for readers at this level (you can study them later).
Agile BI and Runqian Report represent the conventional technologies of data analysis and processing. The operations involved are no harder than the four arithmetic operations (mostly addition, plus maximum, minimum and the like), no higher than junior high school level, so anybody can understand them. But these operations may come with all sorts of conditions, such as adding up only orders over 500, or looking only at Beijing in March. The difficulty is low but the tedium is high. The legendary, lofty BI analysis is really just this; in substance it is no different from the reports we usually make in Excel, only more automated. Once you see through it, there is nothing magical.
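The two conditions mentioned above (only orders over 500, only Beijing in March) can be written down in a few lines. This is just an illustrative sketch in Python with invented order data; it shows that the arithmetic stays at sum, max and min while the conditions carry the detail.

```python
# Hypothetical orders; field names and values are invented for illustration
orders = [
    {"city": "Beijing", "month": 3, "amount": 800},
    {"city": "Beijing", "month": 3, "amount": 450},
    {"city": "Beijing", "month": 4, "amount": 900},
    {"city": "Shanghai", "month": 3, "amount": 1200},
]

# Add up only the orders over 500
big_total = sum(o["amount"] for o in orders if o["amount"] > 500)

# Look only at Beijing in March
beijing_march = [
    o["amount"] for o in orders if o["city"] == "Beijing" and o["month"] == 3
]
beijing_march_total = sum(beijing_march)
beijing_march_max = max(beijing_march)
beijing_march_min = min(beijing_march)

print(big_total, beijing_march_total, beijing_march_max, beijing_march_min)
```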
There is another line of data analysis technology: artificial intelligence, which is very hot right now (data mining, machine learning, ...). The operations involved here are much more complicated (though they still depend on conventional data processing to prepare the data). It is all formulas from probability theory and statistics, which is not easy; it goes beyond high school level, and even college students who did not study their math courses well won't understand it.
By rights, such complex technology should not belong to level 1.
However, with YModel (易明建模) it becomes possible.
YModel wraps up all the complex artificial intelligence algorithms. The user only needs to prepare the data (an Excel table will do) and throw it in; it builds the model automatically, and then you can make predictions. Because it packs in decades of experience from top statisticians, for most business scenarios it does better than programmers who are not familiar with machine learning algorithms.
Why single out this one product again? Are there no other automated artificial intelligence products?
There are, but they are too heavy and still take some wrestling with, so they are not suitable for players at this level.
The entry barrier for these three technologies (products) is fairly low. At most you write formulas, only a little beyond Excel, so they can all be listed as level 1.
YModel is the simplest to use (there are hardly any formulas to fill in), but you need some knowledge of data mining (you don't have to understand the algorithms, but you need some grasp of the concepts of modeling and prediction, and you have to learn how to evaluate whether a model is good or bad). And since artificial intelligence looks like a more high-end business, counting it as level 1.5 is fine too.
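As a taste of what "evaluating whether a model is good or bad" means, here is a small sketch of three common measures for a yes/no prediction: accuracy, precision and recall. The labels and predictions are invented, the function name evaluate is hypothetical, and this is not a description of how any particular modeling product reports its results.

```python
def evaluate(actual, predicted):
    """Compare 0/1 predictions with the real 0/1 outcomes."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # predicted yes, was yes
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # predicted yes, was no
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # predicted no, was yes
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # predicted no, was no
    return {
        # share of all predictions that were right
        "accuracy": (tp + tn) / len(pairs),
        # of the cases predicted "yes", how many really were "yes"
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # of the real "yes" cases, how many the model found
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Invented example: 1 = customer bought, 0 = customer did not buy
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

scores = evaluate(actual, predicted)
print(scores)
```

Which measure matters most depends on the business purpose, which fits the earlier point that knowing what you want from the analysis comes before the technique.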
At level 1 you don't need to learn programming (if writing formulas doesn't count as programming).
After that, it is time to start learning to program.
Some tasks involve more complex calculations. Each step is still only arithmetic in difficulty, but with many steps it is hard to write as a single formula; it has to be done step by step, and the process may involve conditional judgments. More troublesome are tasks that have to be repeated N times, which are exhausting when done by hand, such as merging 500 Excel files into one or splitting one Excel file into 500 small files. There is a book here, http://c.raqsoft.com.cn/article/1649301821440 , in which the vast majority of the problems actually occurred in practice, and they are very hard to solve with Excel and level 1 technologies.
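To show what "repeat N times" looks like once it is a program, here is a rough sketch of merging many files into one and splitting one file into many. It makes several assumptions that are not in the article: the files are CSV exports rather than real Excel workbooks, all inputs share the same header row, and the values in the split column are safe to use as file names. The function names are made up.

```python
import csv
from pathlib import Path

def merge_csv_files(paths, out_path):
    """Append the rows of several CSV files with identical headers into one file."""
    header = None
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header only once
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                writer.writerows(reader)

def split_csv_by_column(path, column, out_dir):
    """Write one CSV per distinct value of `column`; return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    groups = {}
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        for row in reader:
            groups.setdefault(row[column], []).append(row)
    created = []
    for value, rows in groups.items():
        # Assumption: the column value is a valid file name
        target = out_dir / f"{value}.csv"
        with open(target, "w", newline="", encoding="utf-8") as out:
            writer = csv.DictWriter(out, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        created.append(target)
    return created
```

A call such as merge_csv_files(sorted(Path("exports").glob("*.csv")), "all.csv") would handle 5 files or 500 with the same effort; the folder name here is only an example.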
If you learn to program, none of this is a problem.
That is easy to understand, but why SPL here? Aren't there plenty of programming languages?
There are plenty, but at this level there is no other.
Data analysis basically means dealing with tabular data, which has a technical name: structured data. A programming language that is not good at handling structured data can only be used to play with arithmetic and is useless for data analysis. However, programming languages that handle structured data conveniently are few and far between. Even VBA, which comes with Excel, doesn't qualify. There are only SPL and SQL, and Python appears to manage as well (many training courses claim so), but SQL and Python are professional technologies (more on this later) and it is too early for them to appear at level 2.
On amateurs learning to program, you can also refer to this: http://c.raqsoft.com.cn/article/1612232365820
Everything above is amateur-level. After this we enter the professional level, which is the programmer's world.
In the programmer's world of data processing technology, SQL comes first. Databases are probably where the most data is stored, and data there is mainly handled with SQL. Without SQL it is hard to get by in data circles.
Simple SQL looks like English and the syntax is easy. If you have already mastered the concepts and operations of structured data, a few hours on the syntax should be enough to write queries, so normally it would not count as a professional technology. But once learned, you have nowhere to use it: you need a database to run SQL, and installing a database and getting the data into it before you can query are professional tasks. By the time you have done all that you are already a semi-professional programmer.
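For a feel of how English-like simple SQL is, here is a small sketch that runs one query through Python's built-in sqlite3 module with an in-memory database. This setup is an assumption chosen for illustration, not something the article proposes, and the table and data are invented. Note that it still takes a few lines of programming to create the table and load the data, which is in line with the point above that running SQL needs more than knowing the syntax.

```python
import sqlite3

# In-memory SQLite database: nothing to install, nothing written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (city TEXT, month INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Beijing", 3, 800),
        ("Beijing", 3, 450),
        ("Shanghai", 3, 1200),
        ("Beijing", 4, 900),
    ],
)

# The query itself reads almost like a sentence:
# "select city and total amount from orders where month is 3, grouped by city"
rows = conn.execute(
    """
    SELECT city, SUM(amount) AS total
    FROM orders
    WHERE month = 3
    GROUP BY city
    ORDER BY city
    """
).fetchall()
conn.close()

print(rows)
```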
You can, however, learn it with SPL. SPL also implements common SQL and lets users execute SQL against files. But once you have learned SPL, you no longer need SQL for processing and analyzing file data, since SPL's processing power is far better than SQL's.
There is a post on how beginners can learn SQL: http://c.raqsoft.com.cn/article/1619312554522
To repeat what that post emphasizes at the end: SQL is easy to get started with but hard to master, and complex SQL remains very complex. Simple SQL of three or five lines usually appears in large numbers only in textbooks and training courses. For the SQL statements used in real data analysis, we usually measure length in K (thousands of characters) rather than in lines.
So it is appropriate to place it at this level.
Level 3: domestic reporting tools and enterprise BI software
Now it is time to talk about domestic reporting and BI software. These can be seen as the enterprise versions of what level 1 agile BI and Runqian Report do.
Building enterprise applications also requires reporting and BI functions, and here domestic software is technically far better than foreign software (this may be the only category of basic software in which domestic products surpass foreign ones). Several well-known domestic products are all good, each with its own strengths, so I won't name them here. Once you master these products and technologies, handling enterprise BI and report analysis work as a programmer becomes easy.
Systems built with these technologies can be used by non-professionals. The reason they were not mentioned at level 1 is that using a ready-made application system inside a company requires no real technical learning; you just follow the operating rules, and if that really is a bit hard, the company will provide unified training. So there is no need to cover it here.
As a programmer who builds the system, you need to know more than the users. While learning these technologies (products), you should also learn some deeper theory about multidimensional analysis and report models.
A quiet aside: although domestic BI has high technical content, general-purpose enterprise BI is not actually of much use (see http://c.raqsoft.com.cn/article/1612925790144 , which also explains why there is no domestic agile BI). But never mind that. Customers are willing to pay for it now, so learning these technologies gives you work to do, and the knowledge will also be needed for specialized enterprise BI in the future.
Here comes SPL again. It can also serve as a programmer's weapon for structured data processing and computation.
Don't programmers already have databases and SQL?
Yes, but there are many scenarios where neither a database nor SQL can be used. Programmers also have to process data in files, and when more than one database is involved you can't run SQL directly either. Databases that can run SQL are called relational databases, and there are plenty of non-relational databases such as MongoDB, as well as oddly formatted data such as JSON and XML. In these cases, mastering SPL is much more convenient: it doesn't require a database and works in any scenario.
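To illustrate why nested formats such as JSON don't drop straight into SQL tables, here is a small sketch that flattens a nested document into table-like rows and then totals them. The article recommends SPL for this kind of work; Python's standard json module is used here only as a stand-in for illustration, and the document structure is invented.

```python
import json

# Hypothetical nested document: each customer carries a list of orders
raw = """
[
  {"customer": "A", "orders": [{"item": "pen", "amount": 30},
                               {"item": "ink", "amount": 70}]},
  {"customer": "B", "orders": [{"item": "pen", "amount": 20}]},
  {"customer": "C", "orders": []}
]
"""

customers = json.loads(raw)

# Flatten: one row per (customer, order), the shape a relational table expects
rows = [
    {"customer": c["customer"], "item": o["item"], "amount": o["amount"]}
    for c in customers
    for o in c["orders"]
]

# Total amount per customer, computed directly on the flattened rows
totals = {}
for row in rows:
    totals[row["customer"]] = totals.get(row["customer"], 0) + row["amount"]

print(rows)
print(totals)
```

Customer C has no orders and therefore vanishes from the flattened rows, a small example of the decisions a programmer has to make when data is not already in table form.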
Moreover, SQL is often inconvenient for procedural computation, which is why enterprise applications frequently contain the SQL of several K or a dozen K mentioned earlier. Most of the time SPL is much simpler and development efficiency is much higher. Anyone with enterprise report development experience knows that reports are never finished and development costs are extremely high. Our domestic reporting tools already solve the presentation side very well, yet this still happens, mainly because preparing the data with SQL or Java is too difficult. With SPL these things become much easier.
SPL at this level is more complex than at level 2. You need to learn how to connect to databases and various other data sources, and how it can be called from applications, and so on.
Now back to the artificial intelligence line. Programmers who have only just entered the professional level still find it hard to build complex AI models, so they still need AutoML (automated machine learning) tools. As mentioned, apart from YModel, which is lightweight, the AutoML products the industry offers are all somewhat heavy. They still don't require much knowledge of AI algorithms, but they do call for a fair amount of programmer skill (not data processing skill, but programmer skills such as deploying applications and debugging systems), so they are listed at this level.
There are a few such products and technologies both at home and abroad. The foreign ones are more mature, and the famous Google is doing this too. I won't name the others; search the Internet yourself.
By the way, YModel also has a programmer's edition, which can also be counted as AutoML and listed at this level.
Similarly, this can also be counted as level 3.5.
And then? Then comes further study within the professional level, where there are far more technologies.
For performance optimization, use SPL, but the focus is on high-performance algorithms; for big data, use Hadoop/Spark/...; for artificial intelligence, learn Python, and even more so learn mathematics; and so on. These are all higher levels, so there is no need to go into them.
Some may ask: why is Python ranked so high? Don't many people claim it is something workplace staff can be trained in to improve their efficiency?
There is no mistake. Python is a very professional thing, and the vast majority of those who really know Python are serious professionals. It only looks simple. Learning basic programming logic with it is no problem, but using it to process structured data is not easy at all. It is nothing like what some training courses advertise, that ordinary workplace staff can learn it and then use it to handle their daily work. The post on programming for people with zero background, linked earlier, discusses this; refer to it if you are interested.
Python has a large number of open source libraries for artificial intelligence. The automated machine learning products mentioned earlier are easy to use and sufficient for many scenarios, but top players want more control, and there are many scenarios that automated machine learning software cannot handle, so you have to do it by hand. That, however, requires really understanding artificial intelligence technology, and the core of that technology is mathematics (statistics).
Besides Python, similar products and technologies include the commercial SAS, the open source R, and MATLAB, which has many algorithms; Python stands for all of them here.