Practical guide | An inventory of tools commonly used by data journalists

2022-06-24 04:54:00 Octopus big data

The arrival of the big data era has visibly changed every aspect of people's lives. Data journalism, which is built around data, has become a new medium whose descriptive, evaluative, and predictive functions make news faster and more convenient for readers.

But producing data journalism also places higher demands on news teams. Beyond basic professional skills such as writing, investigating, interpreting data, and drawing charts, journalists must also learn to work closely with programmers, data analysts, and web developers. Using the right tools flexibly makes many of these problems much easier to solve.

We have sorted out some tools commonly used by data journalists in three areas: data collection, data analysis, and data visualization. Bookmark it now!

01. Data collection tools

Data collection (data scraping), also known as data capture or web scraping, uses computer programs to collect text and data from web pages and organize them into a format convenient for analysis. The most common approach is to write a "crawler" program in R or Python. Alternatively, existing collection software can gather the required web data without any programming background.
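As a rough illustration of what such a crawler does, the minimal Python sketch below pulls the cells out of an HTML table and prints them as comma-separated rows, using only the standard library. The sample page, the class name TableScraper, and the function scrape_table are hypothetical and exist only for this example. A real crawler would first download the page and would also need to respect the site's terms of use.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page content; a real crawler would download it first,
# for example with urllib.request.urlopen(url).read().decode().
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>120000</td></tr>
  <tr><td>Shelbyville</td><td>95000</td></tr>
</table>
"""


def scrape_table(html):
    """Return the table in `html` as a list of rows (lists of strings)."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = scrape_table(SAMPLE_HTML)
for row in rows:
    print(",".join(row))
```

The result is a plain list of rows, which is the "format convenient for analysis" mentioned above: it can be written to a CSV file or loaded into a spreadsheet.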

1. Octopus collector

Octopus collector is a scraper well suited to beginners. It is simple and easy to use, so you can get started within minutes. To lower the barrier further, Octopus provides beginners with ready-made "simple templates" that cover most mainstream websites. With a template, users can collect data without configuring a task. For websites that have no template, the official site offers detailed illustrated and video tutorials. You can also schedule cloud collection to obtain dynamic data in near real time and regularly export it to a database or a third-party platform.

2. Scrapinghub

If you want to scrape data from websites outside China, consider Scrapinghub. Scrapinghub is a cloud crawler platform built on Scrapy, a Python framework. It is one of the more complex and powerful web scraping platforms on the market, and the company also provides data scraping solutions.

3. WebScraper

WebScraper is an excellent browser plug-in developed outside China. It is also a visual tool that suits beginners who want to scrape data. You simply set a few scraping rules and leave the rest to the browser.

4. Import.io

Import.io is a web-based data scraping tool. It was first launched in London in 2012. Import.io has since shifted its business model from B2C to B2B. In 2019, Import.io acquired Connotate and became a web data integration platform. With its wide range of web data services, Import.io is a good choice for business analysis.

5. Parsehub

Parsehub is a web-based crawler. It can collect data from pages that use AJAX and JavaScript, and from pages that require a login. It offers a one-week free trial.

6. Mozenda

Mozenda is a web crawler that also offers customized services for commercial data scraping. Users can scrape data with either its cloud service or its local software, and have the data hosted.

02. Data analysis tools

1. Excel

Even after all these years, Excel remains a classic tool for processing data. Although advanced data analysis software is now widespread, most data analysis projects can still be handled in Excel, and it is easier to learn. Excel supports important functions such as summarizing, visualizing, and cleaning data. No matter how many other analysis tools you know, you should be familiar with Excel. For simple logical analysis and small datasets, Excel fully meets data-cleaning needs, and it can also perform simple data mining with classification, clustering, association, and prediction algorithms.

2. Tableau Public

Tableau is an interactive data visualization tool with a rich visualization library, and it is easy to operate. Unlike most visualization tools, which require scripting, Tableau is easy for beginners to use. It works like a huge PivotTable with an interactive visual dashboard: you drag and drop data fields to analyze data visually. Tableau also offers a "Starter Kit" and extensive training materials to help users create more analysis reports.

3. Power BI

Power BI is a suite of business analytics tools for delivering insights across an organization. It can connect to hundreds of data sources, simplify data preparation, and support ad hoc analysis. Users can generate polished reports and publish them for the organization to view on the web and on mobile devices. Everyone can create a personalized dashboard to get a comprehensive, unique view of their business. It scales across the enterprise, with built-in governance and security.

4. FineBI

FineBI is a new-generation business intelligence product for self-service big data analysis. It provides a complete solution that spans data preparation, self-service data processing, data analysis and mining, and data visualization. Using FineBI feels similar to using Tableau: both promote visual, exploratory analysis, somewhat like an enhanced PivotTable. It is easy to get started with and has a rich visualization library. It can serve both as a portal for data reports and as a platform for business analysis.

5. QlikView

QlikView is one of the most popular business intelligence tools in the world. It has excellent data analysis and visualization functions and is easy to operate. At the data processing level, a few clicks are enough to delete duplicate rows, replace empty values, trim data, desensitize data, convert types, and so on. QlikView lets users browse data with one click: the system automatically picks the most suitable chart to display the data, which helps users get a first sense of the patterns in it, and further analysis can build on that initial data profile. It offers many chart types. All charts can be linked without any setup, or you can select only some charts to take part in linked drill-down. It also supports one-click selection of statistical methods.
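To make these cleaning operations concrete, here is a minimal Python sketch of three of them: deleting duplicate rows, replacing empty values, and converting types. It illustrates the operations themselves and says nothing about how QlikView implements them. The function name clean_rows, the field names, and the sample records are assumptions made up for this example.

```python
def clean_rows(rows, default="unknown"):
    """Drop duplicate rows, fill empty names, and convert counts to int."""
    seen = set()
    cleaned = []
    for name, count in rows:
        name = name.strip() or default   # replace empty values
        count = int(count)               # type conversion: text -> integer
        key = (name, count)
        if key in seen:                  # skip rows that were already kept
            continue
        seen.add(key)
        cleaned.append({"name": name, "count": count})
    return cleaned


# Hypothetical raw records, as they might arrive from a spreadsheet export.
raw = [("Alice", "3"), ("Alice", "3"), ("", "5"), ("Bob", " 7 ")]
cleaned = clean_rows(raw)
for record in cleaned:
    print(record)
```

The duplicate "Alice" row is dropped, the empty name becomes "unknown", and every count ends up as an integer.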

6. Trifacta

Trifacta is a data wrangling tool that modernizes traditional data cleaning. Data processing in Excel is sometimes limited by the size of the data, whereas Trifacta has no such constraint and can confidently handle very large datasets. In addition, features such as chart recommendations, built-in "out-of-the-box" algorithms, and analytical insights make it convenient to generate data analysis reports.

7. RapidMiner

This tool is more than a data cleaning tool. It can also be used to build machine learning models, and it integrates the commonly used machine learning algorithms. For data analysis, RapidMiner provides lightweight, fast analysis functions along with big data, visualization, and model deployment features. If your work covers the whole path from loading and cleaning data through analysis to building and deploying models, RapidMiner is well worth considering.

8. Weka

One advantage of Weka is that it is easy to get started with, and its interface is very intuitive. It provides data preprocessing, classification, regression, clustering, and visualization. Weka was originally designed at the University of Waikato in New Zealand for research purposes, but more and more professionals now use it.

9. Data Preparator

This tool supports data mining, data cleaning, and data analysis. Its built-in toolkits handle discretization, numerical calculation, data scaling, attribute selection, missing values, outliers, statistics, sampling, and more. A particular benefit is that the dataset being analyzed does not have to be held in computer memory, so you will not run into memory problems when working with large datasets.
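Several of the operations named above are simple enough to sketch in a few lines. The Python example below shows one common way to fill missing values (with the mean), scale data (min-max scaling to the range 0 to 1), and discretize values (equal-width bins). These are generic textbook techniques chosen for illustration, not a description of Data Preparator's own algorithms, and all function names and sample values are hypothetical.

```python
from statistics import mean


def fill_missing(values):
    """Replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    avg = mean(known)
    return [avg if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # all values equal: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins):
    """Assign each value to one of `bins` equal-width intervals (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum value would land in bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


ages = [20, None, 40, 60]             # hypothetical column with a missing value
filled = fill_missing(ages)
scaled = min_max_scale(filled)
binned = discretize(filled, bins=2)
print(filled, scaled, binned)
```

Filling with the mean is only one possible choice. Depending on the data, the median or simply dropping incomplete rows may be more appropriate.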

10. DataCracker

DataCracker is data analysis software dedicated to processing survey data. Many companies now collect survey data, and surveys are also an indispensable step in data journalism. Survey data usually contains many missing values and outliers, so it needs to be cleaned. DataCracker helps clean and analyze survey data quickly, and it can load data from many mainstream survey projects.

03. Data visualization tools

1. Pyecharts

Python is gradually becoming one of the mainstream languages for data analysis and data mining. In the Python ecosystem, developers provide a rich set of third-party data visualization libraries for many scenarios, and these libraries let us draw attractive charts in Python. ECharts (described later in this list) is a free, open-source JavaScript data visualization library that makes it easy to draw professional business charts. pyecharts was born when Python met ECharts: it is a Python interface to ECharts maintained by chenjiandongx and a group of other developers, so we can draw all kinds of ECharts charts in Python.
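An ECharts chart is described by a declarative "option" object with fields such as xAxis, yAxis, and series, and a Python wrapper ultimately has to produce that object for the browser. The sketch below builds such an option for a bar chart by hand with the standard library only. It deliberately does not use the pyecharts API, and the helper name bar_option and the sample figures are assumptions made up for this example.

```python
import json


def bar_option(title, categories, values, series_name="count"):
    """Build an ECharts option dictionary for a simple bar chart."""
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")
    return {
        "title": {"text": title},
        "tooltip": {},
        "xAxis": {"type": "category", "data": list(categories)},
        "yAxis": {"type": "value"},
        "series": [{"name": series_name, "type": "bar", "data": list(values)}],
    }


# Hypothetical data: number of published articles per month.
option = bar_option("Articles per month", ["Jan", "Feb", "Mar"], [12, 18, 9])
option_json = json.dumps(option, indent=2)
print(option_json)
```

The printed JSON is what would be passed to the setOption method of an ECharts instance in the browser. A library such as pyecharts spares you from writing this structure, and the surrounding HTML page, by hand.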

2. Bokeh

Bokeh is a Python-based interactive data visualization tool. It offers an elegant, concise way to draw many kinds of graphics, can visualize large datasets and streaming data with high performance, and helps us build interactive charts, visual dashboards, and more.

3. ECharts

ECharts is a free, open-source JavaScript data visualization library that makes it easy to draw professional business charts. The widely reported Baidu Migration, Baidu Sinan, and Baidu Big Data Prediction products all implement their data visualization with ECharts.

4. D3

D3 (Data-Driven Documents) is another JavaScript library that supports SVG rendering. Beyond line charts and bar charts, however, D3 offers a large number of complex chart types, such as Voronoi diagrams, tree diagrams, circular clusters, and word clouds.

5. CartoDB

CartoDB is an interactive map-making tool with a "one-click mapping" function: after you upload data, it automatically recommends a series of map formats for you to choose from and modify. It is convenient and practical, and it suits people who lack a programming background but want to try visualization. The program was originally developed by two Spanish scientists studying biodiversity and nature conservation. It now has more than 12 million users and is especially popular with data journalists.

6. Google Fusion

Fusion Tables is an application within Google Drive. It is a feature-rich mapping tool that works with common data formats such as CSV and Excel. For mapping, one of its distinguishing features is the ability to merge different datasets, and its geocoding function is also strong. KML (Keyhole Markup Language), a format for recording geographic information, is one of the formats it commonly uses.
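For readers who have not seen KML before, the Python sketch below generates a minimal KML document with one Placemark per location, using only the standard library. The function name build_kml and the sample place are hypothetical. Real KML files often carry extra elements such as descriptions and styles that are left out here.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(places):
    """Return a KML document string with one Placemark per (name, lon, lat)."""
    ET.register_namespace("", KML_NS)   # serialize without a namespace prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in places:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML stores coordinates as longitude,latitude (in that order).
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical location used only for illustration.
kml_text = build_kml([("Newsroom", -0.1276, 51.5072)])
print(kml_text)
```

Note the coordinate order: KML expects longitude first, then latitude, which is a frequent source of misplaced points on a map.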

7. TimelineJS

TimelineJS is used to build timelines of news events. It is a free, open-source visualization tool that currently supports 40 languages. You prepare a table in Google Spreadsheet according to the required format and copy the spreadsheet link into TimelineJS, which then generates the timeline automatically.

8. Infogram

Infogram is an intuitive visualization tool that helps you create attractive infographics and reports. It provides more than 35 interactive charts and more than 500 maps to help you visualize data. Its chart types include column charts, bar charts, pie charts, word clouds, and more.

9. BDP Personal Edition

BDP Personal Edition is a free online tool for data visualization and analysis that requires no download. It covers data access and integration, data processing, analysis, and mining, and multi-device visualization, which greatly improves the efficiency of data analysis. By simply dragging and dropping fields, users can produce all kinds of polished charts.

10. Dycharts

Dycharts is a powerful, free online data visualization tool. Enter your data and it generates visual images, interactive web charts, dynamic data maps, vector charts, and infographics with one click. It supports more than 110 chart types, including word clouds, Sankey diagrams, rose charts, stream graphs, and radar charts. It also provides thousands of visual templates, so visual design for content creation, media operations, marketing posters, market research, thesis writing, work summaries, personal resumes, and other scenarios can all be handled easily in Dycharts.

Copyright notice
This article was written by [Octopus big data]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/09/20210901101518372O.html