How engineers should treat open source -- candid advice from a veteran engineer
2022-07-07 00:20:00 【Microservice spring cloud】
How engineers treat open source
The author is an engineer who has spent more than 20 years on open source work at a well-known technology company. Over that time I have experienced or witnessed many excellent practices by engineers dealing with open source software, and also plenty of bad cases. I want to write down some of that experience here as a reference, in the hope that it helps engineers grow.
Summary
An engineer doing technical work at a technology company is there to use technical means to support and achieve the business goals the company cares about. In practice, that means actively or passively using and maintaining a large amount of open source software. According to published statistics, an engineer doing R&D, operations, or similar work inside a company comes into contact with thousands of open source components every year. For engineers whose main language is Java or JavaScript the number is higher still, on the order of tens of thousands or even hundreds of thousands. (Source: 2020 State of the Software Supply Chain, published by Sonatype.)
So how should you choose open source software? With so many projects available, picking the right one for your personal and business needs takes careful, all-round consideration.
Once you have chosen open source software, how do you customize it and maintain it over the long term? That is another big question. Developing software inside a company is unlike developing software as an individual: the cost of operating and maintaining a system is far greater than the cost of developing it. How to customize and modify the chosen software with the long term in mind, and how to keep maintaining it efficiently and cost-effectively, are questions on which the industry has accumulated many good practices, as well as many failures that serve as lessons.
Finally, coming back to the individual: engineers grow through continuous learning and practice. How to use open source to improve your skills, broaden your horizons, and build your technical reputation and industry influence also matters a great deal to you as an engineer.
This article covers the following three parts:
- How engineers choose open source software
- How engineers customize and maintain open source software
- How engineers use open source for personal growth
1. How to choose open source software
First, we should be clear about our attitude toward open source software: at this stage it is impossible to do without it. Using open source software carries various risks, including open source compliance, security, and efficiency. In one sentence: when using open source software inside a company, follow the company's internal rules on open source, including how software is brought in and how it is maintained, so that its use is efficient, secure, and compliant.
Coming back to the question of how to choose a specific piece of open source software, there are several dimensions to consider:
- Your needs
- Technology trends
- The stage of the software in the adoption cycle
- The maturity of the open source software
- The project's quality metrics
- The project's governance model
1.1 Choose open source software according to your needs
When choosing open source software, first be clear about the need, that is, what you are choosing it for. Is it for personal learning, for meeting the needs of ToB customers, or for building internal services? These three purposes lead to completely different choices. (Note: the latter two scenarios must first consider the company's open source compliance requirements; see Chapter 3.)
Take personal learning first. Here it depends on what exactly you want to learn. Do you want to pick up a popular technology to round out your knowledge and broaden your technical horizons? Do you want to study how a particular open source project is implemented, as a reference for an internal project? Or do you want to prepare technically for your next job? Different purposes lead to different choices. For the first, choose whatever is most popular and whatever fills your gaps. For the second, choose deliberately among the well-known or innovative projects in the relevant field: there is a feature you need now, or one your current project implements poorly, and you want to see how others do it. For the last, prepare according to the position and technical requirements of the next job, and choose according to the bar its technology stack sets. Note that when you choose open source software for personal needs, you usually just write a small project to practice with, such as a demo program or a test service. Because there is no long-term maintenance to consider, you can experiment according to your own ideas and habits. You do not need to follow the internal development process and quality requirements, and you do not need to weigh the stability or community maturity of the software. Just learn from and refer to the code as freely as you like.
Next, the second kind of need: the software you build with open source components will be delivered to customers, often as a private cloud deployment. Here the key is balance between the customer's needs and your own company's technical roadmap or long-term product plan. Software delivered into a customer's IDC environment as a private cloud has to integrate with the upstream and downstream systems in the customer's development and operations environment, so the customer's requirements come first. Some customers have specific requirements for open source software, for example that HDFS be used, and in a specific version. Such requirements may arise because the customer is familiar with that version, or because other software and hardware suppliers delivered that software and version earlier; the purpose is to ease integration and subsequent use and maintenance. If the requirement fits the long-term direction of your own project or product, simply satisfy it. If the customer (Party A) is in a strong position and there is no alternative, likewise use the software and version the customer specifies. But if the requirement conflicts with the long-term direction of your own project or product, and the project or version is negotiable, negotiate with the customer toward a result both sides can accept. The chosen software and version should satisfy the customer so that they pay, keep your delivery cost under control, and still serve the long-term development of your own project or product.
For example, suppose the customer runs an old version of Java, but the software your company delivers to business customers requires a newer one. You then need to negotiate. Either the customer switches to the version you want, in which case you also have to help them upgrade their existing systems; or you lower your own software's Java version requirement, which may mean modifying some of your own code and possibly some of the components it depends on. This is a choice made under many objective constraints, and the product manager and the architect need to negotiate with the customer together.
Finally, consider the scenario of meeting internal service needs, where the services built on open source software serve internal businesses or end users. This is common in the online services and mobile apps of the major domestic Internet companies. Here the developers and maintainers of the project have much more autonomy, and the situation is completely different from ToB delivery. When choosing open source software in this case, weigh development and maintenance costs together, and also consider the stage of the business that uses the service.
(1) If the service supports an innovative business: innovative businesses are generally trial-and-error businesses. They must adjust at any time to market changes and to how execution is going, and the project may well be cancelled within three months. In this case a "quick and rough" approach to development is more appropriate. Do not think too much about maintainability and scalability. Use the technology stack the R&D team knows best, rely on the mature, proven platforms provided by supporting teams such as the infrastructure team, and above all get the system built as soon as possible and iterate quickly with the product. The goal is to minimize learning and development costs for the existing R&D and operations team rather than maintenance costs, because the system must be put together quickly; validating product requirements and the business model matters most, and time is of the essence. If a market opportunity appears, follow up quickly. Once you have a firm foothold, you can scale by trading resources for time (commonly called "piling on machines"), or rewrite the system in the manner of "changing the engine while the plane is flying", which is more cost-effective. For companies or projects in the startup stage, speed trumps everything.
(2) If the system or service built on open source software needs long-term maintenance, for example because it serves a mature business, or because it upgrades and replaces an existing mature platform to address its shortcomings, then, provided business needs are met, maintainability becomes the most important consideration. The key questions about the candidate software become: Is it mature and stable? Is it friendly to secondary development? Are its operating costs economical, that is, does it save machines and bandwidth? Is it convenient to operate, for example can routine scale-out and scale-in be done efficiently, automatically, and without disruption? Is it easy to contribute changes back to the upstream open source community? In this case the cost of developing the system may account for less than 1/10 of the cost of its whole life cycle. So, provided the needs are met, focus on maintainability.
1.2 Choose open source software according to the development trend of Technology
As shown in the figure above, developing modern software or services is a continuous, iterative cycle. It starts with market analysis, moves to the ideation stage, then to coding, and finally to going live, where the application is deployed and takes effect. After launch, the data collected feeds back into a new round of analysis. For a company in a highly competitive industry, the faster this loop iterates, the better. The system also needs fast, elastic, low-cost scalability. If the product direction is right, capacity must expand to absorb rapidly growing traffic and sustain rapid growth; if the direction is wrong, capacity must shrink quickly, freeing hardware and people to invest in the next trial-and-error direction. Among companies in the same industry, if company A can iterate its products and strategies faster and at lower cost, it clearly has a competitive advantage over company B, which iterates more slowly and at higher cost.
There is a great deal of open source software today, with many projects in almost every category. For a specific need, how do you choose? One suggestion is to choose based on technology trends. The current way systems iterate is Agile + Scale. Open source software that supports rapid iteration and scales easily at low cost is therefore worth long-term investment. As for learning and using a new piece of open source software, learners want the learning threshold to be as low as possible. A popular project can be as complex as it likes internally, but it must be friendly to its users. Otherwise, however innovative it is, poor usability means only geeks can learn and master it, and the gap between innovation and mainstream adoption is hard to bridge.
Take Docker. After it appeared, it swept the world extremely fast, and a great many engineers liked it. Docker added new features to traditional container technology: it packages an application and its underlying dependency libraries into a container image, images are versioned, and they can be stored and distributed at scale through a centralized image registry. Docker first solved a problem that had long plagued engineers, namely standardizing the development, test, and production environments, which lets developers iterate quickly. At the same time, because images are distributed from a unified registry and the underlying technology is the container, a kind of lightweight virtual machine that starts very quickly, systems built on Docker can scale elastically with ease. In addition, because the application is encapsulated in an image, it can be abstracted and reused more cleanly along the lines of Domain Model design principles. Technology like this is clearly worth learning for every engineer who builds systems, because of the convenience it brings. By contrast, before Docker appeared, control groups (cgroups) and namespaces had existed for a long time and had long been integrated into the Linux kernel, and Google's Borg-related papers had been published, yet it was still not easy for an ordinary R&D team to master containers and deploy a container system at scale inside a company.
As I recall, after the Borg paper appeared, only Internet companies at the BAT level had small elite R&D teams that developed and used container management systems: for example the team at Baidu responsible for the Matrix system, the team at Alibaba responsible for the Pouch system, and a small team at Tencent working on container systems. Outside those small groups, most engineers did not use containers at scale because they were relatively hard to learn. Docker, in contrast, fit the trend toward agility and elastic scaling very well and was very easy to use, so many engineers adopted it as soon as it appeared, and it became the de facto standard in the market.
Open source software that follows the trend in this way is worth choosing and investing in.
Another example is Spark. Spark addressed the poor performance caused by the frequent I/O operations MapReduce performs during distributed computation, and it was also much easier to use. That is why it displaced MapReduce in the field of distributed computing.
1.3 Choose according to different stages of the open source software adoption cycle
Software is a product of intellectual activity and has a life cycle of its own, which is generally described by the technology adoption curve.
Open source software is software too, and it follows the same law of technology adoption, as shown in the figure below:
A piece of open source software usually goes through 5 stages from creation to decline: the innovation stage (Innovators, 2.5%), the early adoption stage (Early Adopters, 13.5%), then, after crossing the chasm, the early majority stage (Early Majority, 34%), the late majority stage (Late Majority, 34%), and finally decline (Laggards, 16%). Most innovative open source projects never cross the chasm; they die somewhere between early adoption and the early majority. So if you are choosing an open source project that you will use and maintain for a long time, it is more rational to choose one in the early majority or late majority stage.
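The stage boundaries implied by these percentages can be sketched in a few lines of Python. This is only an illustration: the idea of feeding in a cumulative adoption percentage, and the function names `stage_for` and `suitable_for_long_term_use`, are assumptions made for the example. In practice the stage is read off published charts and judged from experience, not measured precisely.

```python
# Stage shares from the adoption curve (percent of eventual adopters).
STAGES = [
    ("Innovators", 2.5),
    ("Early Adopters", 13.5),
    ("Early Majority", 34.0),
    ("Late Majority", 34.0),
    ("Laggards", 16.0),
]

# Stages recommended in the text for systems that need long-term maintenance.
LONG_TERM_STAGES = {"Early Majority", "Late Majority"}


def stage_for(cumulative_adoption_pct: float) -> str:
    """Map a cumulative adoption percentage (0-100) to its stage on the curve."""
    if not 0 <= cumulative_adoption_pct <= 100:
        raise ValueError("adoption must be between 0 and 100")
    upper = 0.0
    for name, share in STAGES:
        upper += share  # running upper bound: 2.5, 16, 50, 84, 100
        if cumulative_adoption_pct <= upper:
            return name
    return STAGES[-1][0]


def suitable_for_long_term_use(cumulative_adoption_pct: float) -> bool:
    """True when the project sits in the early or late majority stage."""
    return stage_for(cumulative_adoption_pct) in LONG_TERM_STAGES
```

The chasm sits at the 16% boundary between Early Adopters and Early Majority, which is where the text says most projects die.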
Of course, if you just want to learn something new, look at projects in the innovator stage or the early adopter stage.
Note that whether you are building a long-lived system or learning for yourself, you should not even look at projects in decline (Laggards). For example, at the time of writing (2022), do not choose projects such as Mesos or Docker Swarm. Since Kubernetes became the de facto standard for container scheduling, these two projects have been in decline, and their parent companies have given them up. Do not invest more effort in developing and maintaining them at this stage, unless a customer (Party A) demands it very strongly and is paying enough that you have no choice.
You may ask where these technology adoption curves can be found.
InfoQ, Gartner, and ThoughtWorks each update and publish their own view of technology adoption every year. Search for them online, look at their respective charts, and combine them with your own industry experience to form a judgment.
For example: https://con.infoq.cn/conference/technology-selection?tab=bigdata
Here you can see InfoQ's 2022 assessment of popular technologies in the big data field.
As the figure above shows, open source software such as Hudi, ClickHouse, and Delta Lake is still at the innovator stage, meaning it has seen little adoption in industry. Those who want to learn about new projects can focus on these, but for now they are not suitable for mature scenarios that need long-term maintenance.
Note that these well-known technology media update their adoption curves every year, so check the publication date when you refer to one.
1.4 Choose open source software according to the maturity of open source software
One more dimension is the maturity of the open source software itself: whether it is released regularly, whether it is maintained by multiple parties (so that even if one company changes strategy and stops maintaining it, other companies still support it in the long term), whether its documentation is complete, and so on.
The open source community has many maturity models for measuring open source projects. The Apache Software Foundation's project maturity model is among the best known.
See: https://community.apache.org/apache-way/apache-project-maturity-model.html
This maturity model, developed by the Apache Software Foundation, evaluates an open source project along 7 dimensions:
- Code
- License and Copyright
- Release
- Quality
- Community
- Consensus Building
- Independence
Each dimension has several items to check. For Independence, for example, there are two: first, whether the project is independent of the influence of any single company or organization; second, whether contributors act in the community as individuals, or appear and act as representatives of a company or organization.
Top-Level Projects of the Apache Software Foundation are assessed comprehensively along these dimensions at graduation. Only projects that meet the standard in every respect are allowed to graduate from incubation at the Apache Software Foundation and become Top-Level Projects. This is also why I personally prefer Apache top-level projects.
In addition, the OpenSSF Criticality Score (see https://github.com/ossf/criticality_score) is a useful reference. It uses indicators such as a project's number of community contributors, commit frequency, release frequency, and number of dependents to judge how important a piece of open source software is in the open source ecosystem. I will not go into detail here; interested readers can consult its materials. Personally I think it is a direction worth watching, but the score is still in its early stages and far from ideal.
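To give a feel for how several indicators can be folded into one number, here is a minimal sketch of a weighted, log-scaled aggregate. As far as I understand, the project's README describes a formula of this general shape, but treat that as an assumption and check the repository for the real definition. The signal values, weights, and thresholds used with it are entirely made up for illustration.

```python
import math


def criticality_score(signals):
    """Combine indicators into a score between 0 and 1.

    `signals` is a list of (value, weight, threshold) tuples, e.g.
    (contributor_count, weight, value_at_which_the_signal_saturates).
    Each signal contributes log(1 + value) / log(1 + max(value, threshold)),
    so very large values stop dominating once they pass the threshold.
    """
    for value, weight, threshold in signals:
        if value < 0 or weight <= 0 or threshold <= 0:
            raise ValueError("values must be >= 0; weights and thresholds > 0")
    total_weight = sum(weight for _, weight, _ in signals)
    weighted_sum = sum(
        weight * math.log(1 + value) / math.log(1 + max(value, threshold))
        for value, weight, threshold in signals
    )
    return weighted_sum / total_weight
```

A hypothetical call might look like `criticality_score([(120, 2.0, 5000), (40, 1.0, 1000), (12, 0.5, 26)])` for contributor count, weekly commits, and releases per year. The log scaling is the design point: going from 10 to 100 contributors matters more than going from 1010 to 1100.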
1.5 Select according to the quality index of the project
Clearly, some open source software has better code quality than other open source software, and sometimes you need to choose on the basis of project quality.
In that case, look at indicators the industry has widely found to be effective.
One of them is MTTU, a metric recommended by Sonatype, a well-known vendor of software supply chain tooling, and discussed in its well-known annual supply chain report. See https://www.sonatype.com/resources/state-of-the-software-supply-chain-2021
MTTU (Mean Time to Update) is the average time an open source project takes to update the versions of the libraries it depends on. For example, suppose open source software A depends on open source library B, the current version of A is 1.0, and it depends on B 1.1. One day B is upgraded from 1.1 to 1.2, and some time later A releases a new version 1.1 in which its dependency on B moves from 1.1 to 1.2. The interval from the release of B 1.2 to the release of A 1.1 is called the Time to Update. It reflects the ability of A's development team to keep its dependency versions in step with the release cycle of the libraries it depends on. Mean Time to Update is the average of these intervals for the project. The lower the value, the better the quality: it shows that the maintainers upgrade dependencies promptly and fix security vulnerabilities introduced through them in good time.
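The definition translates directly into a small calculation. The sketch below follows the A/B example; the release dates are invented for illustration, and the function names are hypothetical rather than part of any Sonatype tool.

```python
from datetime import date
from statistics import mean


def time_to_update(dependency_released: date, adopted_in_release: date) -> int:
    """Days from a dependency's new version to the release that adopts it."""
    return (adopted_in_release - dependency_released).days


def mean_time_to_update(updates) -> float:
    """Average Time to Update over (dependency_released, adopted_in_release) pairs."""
    return mean(time_to_update(dep, own) for dep, own in updates)


# Hypothetical history for software A:
# B 1.2 came out on 2021-03-01 and A 1.1 adopted it on 2021-03-29 (28 days);
# another dependency was bumped 10 days after its release.
history = [
    (date(2021, 3, 1), date(2021, 3, 29)),
    (date(2021, 6, 1), date(2021, 6, 11)),
]
mttu_days = mean_time_to_update(history)  # (28 + 10) / 2 = 19 days
```

A real measurement would pull the release dates from a package registry for every dependency bump in the project's history; the arithmetic stays the same.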
According to Sonatype's statistics, MTTU across the industry keeps getting shorter. For Java open source software on Maven Central, the average MTTU was 371 days in 2011, 302 days in 2014, 158 days in 2018, and 28 days in 2021. As open source libraries release more frequently, the software that uses them also updates faster: compared with 10 years earlier, MTTU has shrunk to less than 1/10 of its former value.
Of course, MTTU is only one indirect measure of project quality. Whether the project has a history of serious, high-risk security vulnerabilities, and whether fixes arrive quickly, are also important dimensions of quality.
The security departments of some large companies continuously evaluate the security of open source software. Software that has repeatedly had high-risk vulnerabilities without timely fixes is designated unsafe, placed on an internal open source blacklist that is announced company-wide, and all business R&D teams are required to stop using it. In practice, where limited R&D manpower keeps old services from being moved off such software, those services must be migrated to a relatively closed network environment to reduce the potential loss. In such cases you should obviously follow the company's security rules and stop using the blacklisted software.
1.6 Choose according to the governance model of the project's open source community
Another dimension, applicable to projects that need long-term development and maintenance, is the community governance model of the open source project.
The community governance model (Governance Model) mainly refers to how the project or community makes decisions and who makes them. Concretely: can everyone contribute, or only a few? Are decisions made by vote or by authority? Are plans and discussions visible?
There are three common governance models for open source communities and projects:
- Single-company led: the design, development, and release of the software are controlled by one company, and external contributions are not accepted. Development plans, release plans, and related discussions are not public, and the source code is released only when a version ships. Google's Android is an example.
- Dictator led (there is a term for this, "Benevolent Dictatorship"): one person, usually the project's founder, controls the development of the project and has strong influence and leadership. For example, the Linux kernel is led by Linus Torvalds, and Python was formerly led by Guido van Rossum.
- Board led: a group of people form a board that decides the project's major matters. For example, projects of the Apache Software Foundation are governed by their PMCs, and decisions at the CNCF are the responsibility of the CNCF board (many technical decisions are delegated to the CNCF Technical Oversight Committee under the board).
In my personal opinion and experience, the order of preference by governance model is:
- Prefer graduated Apache projects (their intellectual property is clear, and at least three parties maintain them over the long term).
- Next, choose key projects of other open source foundations such as the Linux Foundation (the Linux Foundation is very strong operationally, and each key project is usually backed by one or more large companies).
- Choose single-company-led open source projects with care (a company's open source strategy may change at any time, and it may well stop supporting a project; Facebook, for example, has left many projects abandoned).
- Try not to choose personal open source projects (personal open source is more casual and especially risky; still, some are well known and have settled into a long-term maintenance model, for example Vue.js, maintained by the well-known open source author Evan You (You Yuxi)).
This is my recommended order of priority when choosing among similar open source projects. It is only a personal view, and discussion is welcome.
2. How to customize and maintain
Bringing open source software into a company for development and long-term maintenance raises the question of how to customize and maintain it. First, be clear that open source software needs to be customized once it is brought into the company, for the following reasons:
- Open source software is usually built for general scenarios. It has to consider many situations and support many usage patterns, but inside a company it often serves only specific scenarios. Optimizing for those scenarios, for example by trimming functionality, removing features irrelevant to the scenario, and tuning performance and parameters, often yields better performance, such as handling more traffic, and the savings in machine cost can be striking. This is a common form of customization.
- Open source software that will be developed and operated long term inside a company must meet the company's internal operations standards. For example, a service going live needs complete logging and monitoring, may need to expose a health check interface, and needs fault-tolerance measures such as traffic scheduling. All of this requires customization.
- Open source software also has to connect with upstream and downstream systems inside the company. For example, if it relies on underlying distributed storage and distributed computing systems for its basic functions, it has to connect to the company's existing storage or computing systems. The company's underlying virtual machine or container scheduling system has often been modified and optimized too, and the software has to be adapted to it.
- Customization for special scenarios. Using open source software in a company's own scenarios often runs into specific problems or bugs, which call for bug fixes and new features.
2.1 How to customize and modify open source software ?
Here I suggest a few basic principles: avoid modifying the core code of the open source software; use its existing plug-in mechanisms wherever possible; otherwise make changes on the periphery; and regularly upgrade to the community's stable versions.
A lot of open source software is designed at the beginning , There are many extension mechanisms left , It is convenient for subsequent developers to expand functions and add features . For example, some of the most famous open source software Visual Studio Code,Firefox Browser Provided. Extension Mechanism , Many developers develop corresponding plug-ins according to their own needs , And submit the plug-ins to the officially supported plug-in market . After installing the main program, ordinary users , You can also browse the plug-in market , Find and select the plug-ins you need to install . In addition, as Kubernetes, It also provides extension mechanisms in multiple places , For example, where does the core scheduler provide customized scheduler, It can be used to develop personalized scheduling strategies ; The underlying storage and network provide many plug-in mechanisms ; The most commendable thing is that it provides CRD(Custom Resource Definition) The mechanism of , Allow developers to define new resource types , And reuse Kubernetes A statement of maturity API And scheduling mechanism , Carry out very convenient operation and maintenance . therefore , Try to use the existing plug-in or extension mechanism of the open source project to add features .
Modification and customization for some open source software , It's not very suitable to use its extension mechanism , Or it doesn't provide an available extension mechanism . The revision at this time , Try to modify the periphery of the source code core , Instead of touching its core code . Because open source software is iterative with the progress of the open source community , The development of the open source community will continue to bring more and better features . If the core code is modified , And when you need to upgrade to a newer open source version , It will be very painful . Because there are a lot of internal Patch Need to merge , And it requires a variety of tests , It will lead to high upgrade cost and failure to synchronize with the major versions of the community , Finally, due to the resignation or job transfer of some core Engineers , No one can maintain the modification of that part continuously , The whole system cannot be maintained and upgraded , Finally, the whole system is abandoned or pushed down to start again , This will lead to a lot of labor costs . The author has worked in many large Internet factories for many years , I've seen too many such projects , Too many modifications to open source projects , It's very necessary , But because of the core code changes , As a result, it is too expensive to upgrade to a newer version of the open source community , Finally, no one can maintain the system , We have to push down the example of a new start .
Let's give you an example , I saw two technical teams in a large factory maintaining Redis colony , The versions used at that time were Redis 2.x edition . Because there are not many clustering functions , Poor support for large-scale business , So both teams are right Redis Of 2.X The version has been modified . The team A The change method is to change in the periphery , That is to say Redis There's a layer of it , Used for traffic scheduling ,Failover Processing and other functions ; The team B Just change it harder , Change directly Redis Core code , Add the code related to the cluster function directly , Even in some local test scenarios , Better performance . In a short time , Both teams can meet the needs of the business line . however Redis The open source community is constantly iterating , Keep adding more and better needs , When Redis Come out 3.x When , Both teams want to upgrade to a newer version , Because use Redis Our business partners also want to use 3.x Version of . But the upgrade costs are significantly different , The team A Soon transplanted the relevant functions to 3.x above , Soon Redis The version has been upgraded ; The team B Well , Because the changes to the core are too big , The cost of transplantation and testing is too high , So you can't be right 2.x Upgrade the version of the service . Wait until the community 4.x Version out , The team B After the core engineer left , The Redis No one in the cluster can continuously maintain and meet the new version requirements of customers , I had to push it down and start again , From the community 4.X Build a cluster directly from the version , Its own system has been migrated for a long time , It also brings a lot of costs to customers .
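To make Team A's peripheral approach concrete, here is a deliberately simplified sketch of a routing layer that sits in front of unmodified cache nodes and fails over when one is down. It is not Team A's actual code: `InMemoryBackend`, `BackendDown`, and `FailoverRouter` are hypothetical names, the backends are in-memory stand-ins rather than real Redis connections, and data replication is left out.

```python
import zlib


class BackendDown(Exception):
    """Raised when a backend node cannot serve requests."""


class InMemoryBackend:
    """Stand-in for one unmodified cache node (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

    def get(self, key):
        if not self.alive:
            raise BackendDown(self.name)
        return self.data.get(key)

    def set(self, key, value):
        if not self.alive:
            raise BackendDown(self.name)
        self.data[key] = value


class FailoverRouter:
    """Peripheral layer: picks a node by key hash, tries the next node on failure."""

    def __init__(self, backends):
        self.backends = list(backends)

    def _candidates(self, key):
        # Stable hash so the same key prefers the same node across calls.
        start = zlib.crc32(key.encode("utf-8")) % len(self.backends)
        return self.backends[start:] + self.backends[:start]

    def _call(self, key, op, *args):
        for backend in self._candidates(key):
            try:
                return getattr(backend, op)(key, *args)
            except BackendDown:
                continue  # fail over to the next candidate node
        raise BackendDown("all backends are down")

    def get(self, key):
        return self._call(key, "get")

    def set(self, key, value):
        return self._call(key, "set", value)
```

Because the router only uses the nodes' public `get` and `set` operations, swapping the nodes for a newer server version leaves this layer largely untouched, which is the property that made Team A's upgrade cheap.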
Therefore, the author recommends keeping every modification to the source code of open source software as a local patch: first, this eases maintenance and upgrades; second, it makes the changes easy to manage and count. In this mode, the internal project's build script typically unpacks the open source software's source package, applies the local patches one by one with the patch command, and then compiles and tests the result. The alternative is to commit the patches directly into the source tree. That saves a few minutes in the CI stage but makes later maintenance, upgrades, and management considerably harder.
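The build flow just described can be sketched roughly as follows. This is an illustration under assumptions, not a prescribed layout: the patch files are assumed to be numbered `*.patch` files in one directory, the tarball is assumed to contain a single top-level directory, and the `make` and `make test` targets are placeholders for whatever the project actually uses.

```python
import subprocess
from pathlib import Path


def plan_build(tarball, patch_dir, work_dir="build"):
    """Return the ordered commands for an unpack -> patch -> build -> test run."""
    # Sorting by file name applies 0001-..., 0002-... in a stable order.
    patches = sorted(Path(patch_dir).glob("*.patch"))
    commands = [
        ["mkdir", "-p", work_dir],
        ["tar", "-xf", str(tarball), "-C", work_dir, "--strip-components=1"],
    ]
    for patch_file in patches:
        # -d switches into work_dir first, so the patch path must be absolute.
        commands.append(
            ["patch", "-p1", "-d", work_dir, "-i", str(patch_file.resolve())]
        )
    commands.append(["make", "-C", work_dir])          # assumed build target
    commands.append(["make", "-C", work_dir, "test"])  # assumed test target
    return commands


def run_build(commands):
    """Execute the plan, stopping at the first failing step."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Separating the plan from its execution makes it easy to list or count the local patches applied to a given upstream version, which is the management benefit mentioned above.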
2.2 Give back to the community: upstream your changes to the open source community to reduce maintenance costs
When engineers inside an enterprise add features or bug fixes to a given version of an open source project, the changes generally live in the code base as local patches. The author suggests that once the business problem is solved, engineers submit these local patches to the project's upstream open source community, completing the upstream process.
Upstreaming has several advantages:
- Better code
Patches written inside an enterprise, especially bug fixes, are often "hacks" made under time pressure. To get a fix online quickly, the patch may be placed somewhere that is not quite right, its logic may have holes, and it may not handle enough exceptional conditions. If you bring such a local patch back to the project's open source community and discuss it in depth with the community's senior engineers (module reviewers and module owners), their feedback usually lets you improve the patch and end up with better code.
- Lower maintenance costs
Every time you upgrade to a newer version of the open source software, each internally retained local patch has to be evaluated, and some have to be merged and tested. Naturally, the fewer local patches the better. The best outcome is for the community's new release to already contain them: the more it contains, the fewer local patches need internal evaluation, merging, and testing, and the cheaper the upgrade. As I recall, each Fedora release carried many local patches for the kernel and other components, and Red Hat engineers kept contributing them to the upstream projects. This kept the number of local patches in Fedora relatively low and the cost of version upgrades under control.
- A stronger team technology brand and employer brand, easier recruiting, and prouder engineers
Contributing code upstream and upstreaming local patches earns a better reputation in the community. It shows these technology communities that the company is not just a consumer of open source software but also a contributor.
It also builds a strong technology brand for the team, showing that the company is strong not only in business but also in engineering, which helps external recruiting.
Upstreaming also raises the pride and satisfaction of the engineers on the team.
For example, when Xiaomi adopted Apache HBase at scale, the responsible R&D engineers firmly followed an upstream strategy: they kept contributing Xiaomi's internally validated patches back to the HBase community and kept discussing and developing features together with community members. Xiaomi's influence in the HBase community grew steadily and it produced a series of committers and PMC members, until Xiaomi engineer Zhang Duo became the project's PMC Chair. Xiaomi's technology brand in big data, cloud computing, and related fields owes much to the R&D team behind this project.
3. How to use open source for personal growth
An engineer's growth is closely tied to daily work and to daily study. Here are some suggestions on how to use open source software along the way to grow faster and to realize professional or technical ambitions.
3.1 Openness and sharing: vision and mindset
Stand on the shoulders of giants to see further. The open source world has software for every kind of scenario and every kind of problem. So keep an open mind: before starting any technical work, first look at how others have done it. The world is big, and 99.99% of the problems an engineer meets have been met by someone else. See how they solved them and what you can learn. In particular, look at other people's open source projects: read their design documents to see how they think, and read their source code to see how they build. If you are interested, contact them directly. First, this spares you many detours, much duplicated work, and falling into the same pits others already fell into. Second, you avoid reinventing the wheel and can spend your limited time on more valuable work. Never be the frog at the bottom of the well that looks at its patch of sky and thinks itself the best in the world. Look more at the industry and at open source; new graduates joining large companies should pay special attention to this.
We also need a sharing mindset. Share what you have learned so that others can refer to it and draw on your experience and lessons, and everyone improves together.
3.2 Recommended steps and methods for learning open source software: the Feynman technique
There are many ways to learn open source software. Depending on your purpose and your own situation (how familiar you are with the field and with the relevant projects), choose the method that suits you.
Here is a method I recommend for engineers learning a new open source project:
- Get hands-on as early as possible. Run through the project's Quick Start and Tutorial to understand its main scenarios and key features.
- Then read the documentation, focusing on the main architecture diagram, to understand the overall architecture and form a big-picture mental map.
- Finally, dig into the relevant details, both documentation and code, in light of your own application scenario.
For example, to learn Kubernetes, first go to its official website and run through the tutorial there (https://kubernetes.io/docs/tutorials/kubernetes-basics/create-cluster/cluster-interactive/) to learn how to create a pod, access it, update it, route traffic to it, and so on. Then study the architecture diagram and understand its declarative design principle, the core components (kube-apiserver, kube-scheduler, kube-controller-manager, kubelet), and how they interact. Finally, based on your own business scenario, decide which parts deserve deeper study. If you need to add your own storage backend, for instance, read the relevant code and look at how other vendors implemented theirs.
Starting by plunging into the source code is not recommended: you have no thread to follow and it is inefficient. Besides, many open source projects are now too big and iterate very quickly. Hardly anyone can understand all the code, no individual has the energy to, and there is no need to.
Learning must be combined with application, that is, with doing. As the ancient saying goes, "What you learn from paper always feels shallow; to truly understand something you must practice it yourself." This is especially true for engineers. If you want a deeper understanding of a new technology, or even plan to switch technical direction or career track, you must do more hands-on work: use the software, or write some programs and demos and run them in an experimental environment, ideally solving a practical problem close at hand. Beware of ambition outrunning ability: everything looks simple until you actually try to run it and find it hard. You can join innovation activities inside your company, such as a hackathon, to try the new technology, or write a small tool, get it running, and solve a small practical problem. For example, to practice Python, write a crawler that fetches data from a weather forecast website every day and offers a simple query for the current forecast. Learn by using, and put what you learn to use.
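The weather crawler suggested above might start out like this minimal sketch, using only the Python standard library. The page structure is an assumption: it supposes each forecast sits in a `<span class="forecast">` element, which will not match any particular real site, and any URL passed to `fetch_forecasts` is up to the reader. Check a site's terms of use before crawling it.

```python
import urllib.request
from html.parser import HTMLParser


class ForecastParser(HTMLParser):
    """Collect the text of <span class="forecast"> elements (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.forecasts = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and ("class", "forecast") in attrs:
            self._capturing = True
            self.forecasts.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.forecasts[-1] += data


def parse_forecasts(html):
    """Extract forecast strings from an HTML document."""
    parser = ForecastParser()
    parser.feed(html)
    return [text.strip() for text in parser.forecasts]


def fetch_forecasts(url):
    """Download a page and parse it; the URL is supplied by the caller."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return parse_forecasts(html)
```

Keeping the parsing separate from the download makes the tool easy to test offline, and scheduling the daily run (with cron, for example) is a natural next exercise.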
Another very useful method is the Feynman technique, regarded as one of the most effective and powerful ways to learn. I have tried it and it works. The steps are simple; I reduce them to three:
- First, learn a technology.
- Explain it to an ordinary person until they understand.
- If the listener does not understand, go back to step one.
In other words, only when you can explain how a technology is used and how it is architected, in a way ordinary engineers understand, have you really mastered it.
The Feynman technique is named after Richard Feynman, the Nobel laureate in physics. He was a renowned theoretical physicist, one of the founders of quantum electrodynamics, and is regarded as the father of nanotechnology; he won the 1965 Nobel Prize in Physics for his contributions to quantum electrodynamics. The learning method he advocated is called the Feynman technique. The steps are simple, but simplifying a complex technology and explaining it so that ordinary engineers understand requires deep understanding and mastery, plus the use of comparison and analogy to simplify terminology and concepts. If you can do this, you have got through the door of the technology and can go on to study it in depth.
Taking a well-known industry course exam or certification is another good approach. For example, when an engineer unfamiliar with cloud native technology passes the CKA (Certified Kubernetes Administrator) exam, the certification shows a certain level of competence: a fairly comprehensive understanding of common Kubernetes operations and system architecture.
3.3 Join the open source community and build a lifelong personal reputation
Finally, engineers who take part in an open source community and contribute actively earn a lifelong reputation and make lifelong friends, which greatly helps their long-term development. Engineers are encouraged to pick the projects and communities that interest them and to keep growing there through communication and contribution. Even if work or other reasons later make them inactive in that project and community, their contributions will always be recognized. The Apache Software Foundation has a famous motto, "Merit never expires" (see http://theapacheway.com/merit-never-expires/): the recognition an engineer earns by contributing to Apache Software Foundation projects and communities never goes out of date. Once a committer, always a committer.
Collaborating in an open source community is also a way for engineers to socialize. Here you can make lifelong friends and work and communicate with them, which is very effective for growth. Many experts in open source communities are very friendly, especially toward newcomers, and are all the more willing to teach junior engineers hands-on when those engineers show a strong desire to contribute. With their help and guidance, newcomers grow quickly, free of the ceilings imposed by a business, a department, or a job assignment. In short, newcomers who work in the projects and communities that interest them, with an open mind and a desire to contribute, and who keep communicating with and learning from senior engineers there, can improve their technical ability rapidly.
Moreover, engineers today can hardly expect lifelong employment at one company. After some time at a company, changes they choose or have forced upon them will alter their position or employer. But the recognition earned in an open source community, and the personal brand and technical reputation built there, stay with the individual regardless of what happens to the company. Many people long active in open source communities have changed jobs many times, yet their standing and brand in the community endure. For many engineers this is also a good way to achieve a career breakthrough and move beyond the limits of their current platform.
Contributing to the open source community over the long term benefits both others and yourself. I encourage every engineer with ideas and drive to find open source projects and communities they like, invest in them, and become part of them.
3.4 How to contribute to the open source community
In open source communities, especially those governed as meritocracies (for example, Apache Software Foundation projects), the more you contribute, the more recognition you earn. But a newcomer often cannot contribute just by rolling up their sleeves. You need to learn the community's rules first and follow them before you can gradually become part of it.
1. What to contribute?
Before contributing, understand that contributions to an open source community are not limited to code. Writing code to add features or fix bugs is a contribution; so are improving documentation and test cases, reporting problems found in use, and blogging about or recommending the project. All of these are widely recognized within open source communities.
Many community technical leaders began contributing by submitting test reports. Take Blake Ross, once the youngest architect in the Mozilla community (at 17 he joined one of the community's highest technical decision-making bodies, and he co-founded the Firefox project with another architect): he first entered the Mozilla community as an intern and started with testing.
"Scratch your own itch!" is a popular saying in the open source community. It means that your contributions should solve your own problems: you hit a problem in real work, try to solve it, and then contribute the solution in a form the community accepts. Typically a bug or issue is affecting your actual use, or you want a new feature for your company's scenario, or you simply want to learn a new technology. Contributions driven by your own needs last. Contributions made for small rewards from community-run events, or just for fun, do not.
So a newcomer entering an open source community can start with simple problems and with their own needs. A simple example: read the getting-started document and follow its steps one by one to see whether they work. If they do not, file a bug. If you found that extra steps were needed to get through, submit a patch to the document describing them. The community welcomes this kind of contribution too.
Some communities label simple bugs as "Good First Issue". Contributors can pick these issues to get familiar with the contribution process and ease into the community.
2. Understand the existing community and respect its practices and habits
The first step in contributing to an open source community is to understand it.
Use the community's website, mailing lists, wiki, and the documents in its GitHub repository to learn the basics about the community.
Read the key documents (such as Contributing.md) to learn the project's contribution process and recommended practices.
Note that each open source community has its own conventions. Each has its own issue tracker (some use GitHub Issues, some Bugzilla, some Jira), and the process and requirements for submitting patches differ too.
For example, the long-established Apache HTTP Server project asks the following of contributors:
- Patches must follow the project's code style
- There are also code quality requirements, such as thread safety
- Patches must be made against the current development version, 2.5.x
- Patches must be generated in the format produced by diff -u file-old.c file.c
- Patches are submitted at bz.apache.org/bugzilla, preferably with the "PatchAvailable" keyword
- You can also discuss a patch on the mailing list; the subject line should begin with [PATCH]
Note that they do not use the fork and pull request model popular on GitHub but the older Bugzilla plus diff patch model. Please respect their working habits and use the model they require. (To be honest, when the author contributed to the Mozilla community 20 years ago, the working model was also Bugzilla plus diff patches. More than 20 years on, the Apache HTTP Server project's way of working has changed little. But the working model does not stand in the way of contributing; you just need to get familiar with it.)
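For readers who have only used pull requests, the unified diff format that `diff -u` produces can be explored without leaving Python. The sketch below uses the standard library's `difflib`; it is meant for getting to know the format, and an actual submission should still be generated the way the project asks. The function name and the sample C lines are made up for illustration.

```python
import difflib


def make_unified_diff(old_text, new_text,
                      old_name="file-old.c", new_name="file.c"):
    """Return a unified diff, similar in shape to `diff -u old new` output."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff_lines)


if __name__ == "__main__":
    old = "int retries = 1;\nreturn retries;\n"
    new = "int retries = 3;\nreturn retries;\n"
    # Lines starting with "-" were removed, lines starting with "+" were added.
    print(make_unified_diff(old, new), end="")
```

The `---` and `+++` header lines name the old and new files, and each `@@` hunk header gives the line ranges the change covers, which is what lets a reviewer apply the patch with the patch command.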
Some open source communities offer a gamified contribution process: a series of simple newcomer tasks that familiarize developers with the project and the contribution flow. This is friendlier to newcomers and is carefully designed by the community's managers. Contributors should not waste that effort: work through the tasks you find necessary until you know the tasks and processes you need.
3. Be polite and respectful, and respect the community's diversity
Open source communities are highly diverse.
Most senior engineers in open source communities are very friendly to newcomers. They patiently help new people get to know the documentation, the contribution process, and so on (usually only once, though, so do not let them down). In daily communication, whether on mailing lists, in IRC or Slack channels, or in issue comments, they are generally nice, which makes it easy to communicate and collaborate with them.
Note, however, that some people's attitude is not particularly good. If you meet one, avoid a head-on conflict. Ask more senior engineers in the community for help instead of confronting the person directly. You cannot change anyone, and not everyone will like you; just get the necessary work done.
4. How to quickly find the module owner responsible for code review and complete your contribution
Sometimes you follow the community's documented contribution process, whether filing an issue or reporting a bug, and find that the module owner responds very slowly. A few techniques help here.
Join the community's IRC or Slack channel, find the relevant module owner, and have a polite, constructive conversation.
Build a good relationship with module owners and gradually earn their trust through real contributions.
Remember that open source communities run on trust. Earning the module owner's trust makes all later work much easier.
5. Steps for submitting a big patch
An engineer may complain: "I submitted a great feature to such-and-such open source community. It was tested and verified in my company's internal environment and worked very well, with excellent performance. But when I submitted the code upstream, the community did not value the feature. Instead they picked my patch apart and found all kinds of problems with it. It is too much trouble and too tiring; I would rather not contribute at all."
Put yourself in the reviewer's place. If a stranger submitted a big patch to your project, code review would be hard simply because the patch is large. The contributor may say the patch is useful, implements a powerful feature, and has been verified, but whether they are reliable, whether they will stay in the community, and whether they will promptly fix problems their code causes are all open questions. So before basic trust exists (that is, before you have had several small patches submitted and merged), getting a big patch accepted is hard work.
In addition, the engineer submitting the patch often does not know the community's history. The feature may have been discussed long ago, and the conclusion may have been that it is unnecessary or belongs elsewhere. So do not be blindly confident in your patch; first discuss the scenario and the problem with the community's engineers.
The author suggests contributing in the following steps:
- If you judge that the patch will be large, first discuss the problem in the community so that the community acknowledges it; you may also learn the community's history with the issue, if any
- If the community agrees that the problem exists and should be fixed now, go on to discuss solutions
- Once the problem and the approach are accepted and a brief design is done, discuss the concrete patch
- The patch must follow the community's standards (code style, component calling conventions, testing standards, documentation standards, and so on)
- Be mentally prepared: the patch may need several rounds of revision before it is merged, and you may need to split a big patch into several small ones that are submitted and merged in batches. Compromise where necessary.
Contributing a big patch that implements an important feature takes many steps and a long time, but once merged it earns high recognition from the community and often lays the foundation for becoming a higher-level contributor. For the individual contributor, the inner satisfaction and sense of achievement are also considerable.
6. Things to avoid
- Proposing an idea and expecting others to implement it.
This applies especially when you have just joined a community. If you only suggest that the community should do something, without doing it yourself and expecting others to, your opinion will usually be ignored. "There are many people who, 'the moment they step off the carriage', start holding forth and complaining, criticizing this and blaming that; in fact, ten out of ten such people will fail." People like this are even less welcome in the community.
The recommended practice is to raise a problem, offer a constructive solution at the same time, show that you want to take part, and invite others in the community to join you.
- Being too eager, lacking patience, and ignoring the community's conventions.
It is better to go slowly and be patient, especially before the community has built trust in a newcomer. I once met an engineer new to an open source community who was technically very strong but cared only about getting his patches merged. When talking with the module owner he was polite, but he responded perfunctorily to the owner's suggestions for improvement. After a few such rounds he had lost his reputation in the community, his bug fixes and new features made slow progress, and he eventually left the project, disheartened.
- Crossing red lines (that is, behavior prohibited by the community's code of conduct).
Almost every mature open source community has its own Code of Conduct, usually displayed prominently on the community website or in the code repository.
It lists behavior the community does not welcome, including discrimination and offense relating to gender, race, religion, and so on.
Take care to avoid such behavior. Something that might not be seen as a big problem in China's open source communities is not necessarily a small matter in an international community.
7. Pay attention to compliance issues when making contributions to upstream communities within the enterprise
Contribute to the upstream community within the enterprise , Because it is to disclose the results of internal research and development of the company , Therefore, we need to meet the internal open source contribution management measures of the company .
Each company has different regulations on this . For example, Google encourages engineers to contribute to the open source community , But the engineer is required to google.com Email address to contribute ,100 The following contributions do not need to be approved by the internal process , But only if the project does not adopt Google Prohibited licenses ( for example AGPL,Public Domain,CC-BY-NC-*), In addition, there are some hard conditions , See Google OSPO Link to the official website of https://opensource.google/documentation/reference/patching. Domestic Baidu companies also encourage engineers to contribute to the open source community , No matter what Patch All sizes need to go through internal electronic processes , Approved by the technical director of the Department , And handed over to Baidu's open source management office (OSPO) Filing , In order to provide data support for the subsequent data statistics of the open source office and the contribution incentive to engineers .
When contributing to an upstream community from within an enterprise, engineers are often asked by the community to sign a CLA (Contributor License Agreement) or a DCO (Developer Certificate of Origin). CLAs come in two forms: the ICLA (Individual Contributor License Agreement), which is signed by an individual, and the CCLA (Corporate Contributor License Agreement), which covers a whole enterprise. Once the enterprise has signed a CCLA, its engineers no longer have to sign an ICLA individually before contributing. Without a signed CLA, you cannot submit patches. In essence, a CLA grants the community the right to use the contributor's contributions. When you are asked to sign one, follow the company's internal regulations; the CLA terms may need to be reviewed by the company's legal department. Fortunately, some well-known projects share their CLA terms: the projects of the Apache Software Foundation use a unified CLA document, and the projects of the CNCF are similar. The CLA terms of these well-known projects have been reviewed by the legal department and raise no legal problems. If a CLA has not yet been confirmed by the legal department, consult the legal staff responsible at your company to avoid a CLA that is unfavorable to the enterprise.
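In practice, projects that use a DCO usually expect each commit message to end with a `Signed-off-by: Name <email>` trailer, which `git commit -s` appends automatically. The following minimal sketch checks a commit message before it is sent upstream. The helper names are hypothetical, and individual communities may apply stricter rules than this.

```python
import re

# A DCO sign-off is a commit-message line of the form
# "Signed-off-by: Name <email>".
SIGN_OFF = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>\s]+@[^<>\s]+)>$",
    re.MULTILINE,
)


def sign_offs(commit_message: str) -> list[tuple[str, str]]:
    """Return every (name, email) pair found in Signed-off-by lines."""
    return [(m["name"], m["email"]) for m in SIGN_OFF.finditer(commit_message)]


def has_sign_off(commit_message: str, author_email: str) -> bool:
    """Return True if the commit carries a sign-off from the given email."""
    wanted = author_email.lower()
    return any(email.lower() == wanted for _, email in sign_offs(commit_message))
```

A check like this could run in a local hook, so that a missing sign-off is caught before the community's own automation rejects the patch.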
Summary
This article is relatively long and condenses much of my own experience and many of the lessons I have learned.
I have always thought of engineers as a very pragmatic and hardworking group of people: people who deeply believe that "we can use code to change the world", and who live by "Talk is cheap. Show me the code." and "Advance a little every day, and no effort is wasted." I have always thought that openness, collaboration, and pragmatism are among the best qualities of contemporary engineers.
Learning, working, and sharing in the open source world is one of the best ways for engineers to change the world.