Understanding Quality Assurance for Open Source Software (OSS)
2022-06-22 14:07:00 【51CTO】

This article introduces the QA framework used in developing the Milvus vector database. It covers the main test modules in Milvus, along with the methods and tools that can improve QA testing efficiency.
Quality assurance (QA) is a systematic process for determining whether a product or service meets specified requirements. A QA system is therefore an integral part of the R&D process and plays a key role in ensuring product quality.
In the sections below, I walk through the QA framework adopted for the Milvus vector database, its main test modules, and the tools and methods behind them.
PART 01
Milvus QA System Overview
Because system architecture matters so much to QA testing, the more familiar QA engineers are with the system, the more likely they are to come up with a reasonable and effective test plan.

Milvus architecture
Milvus 2.0 adopts a cloud-native, distributed, layered architecture, in which the SDK is the main entry point for data flowing into Milvus. By frequently running functional tests through the SDK, we can detect problems inside the Milvus system. Besides functional tests, the vector database also needs unit tests, deployment tests, reliability tests, stability tests, and performance tests.
A cloud-native, distributed architecture brings both convenience and challenges to QA testing. Unlike a system deployed and run locally, a Milvus instance deployed and running on a Kubernetes cluster ensures that software testing is carried out in the same environment as software development. The downside is that the complexity of a distributed architecture introduces more uncertainty, which makes QA testing more laborious. For example, Milvus 2.0 uses microservices for its different components, which increases the number of services and nodes and, with it, the probability of system errors. We therefore need a more comprehensive QA plan to improve testing efficiency.
PART 02
QA Testing and Issue Management
In Milvus, QA covers both running tests and managing the issues that arise during software development.
1、QA testing
As shown in the figure below, different types of QA tests should be carried out in order of priority, based on Milvus features and user requirements.

QA Tests and priorities
In Milvus, QA testing mainly targets the following aspects:
- Function: verify that functions and features work as originally designed.
- Deployment: check whether users can deploy, reinstall, and upgrade both Milvus standalone and Milvus cluster using different methods (such as Docker Compose, Helm, APT, and YUM).
- Performance: test the performance of data insertion, indexing, vector search, and query in Milvus.
- Stability: check whether Milvus can run stably for 5 to 10 days under a normal workload.
- Reliability: test whether Milvus can still partially function when a system error occurs.
- Configuration: verify that Milvus works as expected under a specific configuration.
- Compatibility: test whether Milvus is compatible with different types of hardware and software.
2、Issue management
Many issues can surface during software development. They may be raised by QA engineers themselves or by Milvus users from the open-source community. In either case, the QA team is responsible for identifying these issues.

Issue management workflow in Milvus
When an issue is created, it first goes through triage. During triage, the new issue is checked to make sure it contains enough detail for developers to confirm it, accept it, and attempt a fix. Once the fix is in, the issue owner verifies it and decides whether the issue can finally be closed.
PART 03
When Is QA Needed?
A common misconception is that QA and development are independent of each other. In fact, to ensure the quality of the system, developers and QA engineers need to work together, with QA running through the entire life cycle.

Bringing QA into the whole software development life cycle
As shown in the figure above, a complete software development life cycle consists of three stages:
- In the initial stage, developers publish design documents, and QA engineers draw up test plans accordingly, define release criteria, and assign QA tasks. Developers and QA engineers both need to be familiar with the design documents and the test plans, so that the two teams share an understanding of the release goals: features, performance, stability, bug convergence, and so on.
- During development, developers and QA engineers interact continuously to verify the features and functions being developed, and to fix the bugs and issues reported by the open-source community.
- In the final stage, a Docker image of the new Milvus version, meeting the release criteria and accompanied by release notes and a release tag, can be published. The QA team also publishes a test report on the release.
PART 04
Test Modules in Milvus
The six test modules in Milvus are described in detail below:
1、Unit testing

Unit testing
Unit tests help identify software bugs as early as possible and provide a verification baseline for refactoring. According to the Milvus pull request (PR) acceptance criteria, unit test coverage of the code should reach 80%.
2、Functional testing
In Milvus, the main purpose of functional testing is to verify that the interfaces work as designed. Functional tests are built around PyMilvus and the SDKs, and cover two aspects:
- Test whether the SDK returns the expected results when correct parameters are passed.
- Test whether the SDK handles errors and returns a reasonable error message when incorrect parameters are passed.
The following figure depicts the current functional testing framework, which is based on the mainstream pytest framework. The framework adds a wrapper around PyMilvus and runs tests through an automated test interface.

Functional testing framework in Milvus
Because tests share common patterns and some functions need to be reused, the tests go through this framework rather than calling the PyMilvus interface directly. The framework also includes a "check" module that makes it easy to compare expected and actual values.
The tests/python_client/testcases directory contains as many as 2,700 functional test cases, which cover almost all PyMilvus interfaces. These functional tests strictly gate the quality of every PR.
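The two aspects of functional testing and the "check" module can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Milvus test framework: `FakeClient`, `check_result`, and `check_error` are hypothetical names, and the stub client stands in for PyMilvus so that the example runs without a server.

```python
class FakeClient:
    """Hypothetical stand-in for an SDK client; not the real PyMilvus API."""

    def __init__(self):
        self.collections = {}

    def create_collection(self, name, dim):
        if not isinstance(name, str) or not name:
            raise ValueError("collection name must be a non-empty string")
        if not isinstance(dim, int) or dim <= 0:
            raise ValueError("dim must be a positive integer")
        self.collections[name] = dim
        return name


def check_result(actual, expected):
    """Check module, success path: compare the actual and expected values."""
    assert actual == expected, f"expected {expected!r}, got {actual!r}"


def check_error(func, expected_message, *args, **kwargs):
    """Check module, error path: the call must fail with a sensible message."""
    try:
        func(*args, **kwargs)
    except Exception as exc:
        assert expected_message in str(exc), f"unexpected error: {exc}"
        return
    raise AssertionError("expected an error, but the call succeeded")


def test_create_collection_with_valid_params():
    # Aspect 1: correct parameters return the expected result.
    client = FakeClient()
    check_result(client.create_collection("demo", dim=8), "demo")


def test_create_collection_with_invalid_dim():
    # Aspect 2: wrong parameters produce a reasonable error message.
    client = FakeClient()
    check_error(client.create_collection, "positive integer", "demo", dim=-1)
```

Saved in a file, both test functions would be collected and run by pytest; keeping the comparison logic in shared check helpers is what lets many test cases stay short.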
3、Deployment testing
Milvus comes in two modes, standalone and cluster, and there are two main ways to deploy it: Docker Compose and Helm. After Milvus is deployed, users can run restart tests or upgrade tests. A restart test checks data persistence, that is, whether data is still available after a restart. An upgrade test checks data compatibility, to prevent incompatible data formats from being introduced into Milvus. As shown in the figure below, both types of deployment test share the same workflow:

Deployment test workflow
In a restart test, both deployments use the same Docker image. In an upgrade test, the first deployment uses the Docker image of the previous version, and the second deployment uses the Docker image of the later version. Test results and data are saved in a Volumes file or a persistent volume claim (PVC).
The first test run creates multiple collections and performs different operations on each of them. The second test run focuses on verifying that the existing collections still support CRUD operations, and that new collections can still be created.
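The two-run workflow can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeDeployment` is a hypothetical stand-in that persists its state to a JSON file in a temporary directory (playing the role of the persistent volume), and the version strings are placeholders. A real deployment test would deploy Milvus through Docker Compose or Helm instead.

```python
import json
import tempfile
from pathlib import Path


class FakeDeployment:
    """Hypothetical deployment whose data lives on a 'persistent volume'."""

    def __init__(self, volume: Path, version: str):
        self.version = version
        self.state_file = volume / "collections.json"
        # On startup, load whatever an earlier deployment left on the volume.
        if self.state_file.exists():
            self.collections = json.loads(self.state_file.read_text())
        else:
            self.collections = {}

    def insert(self, name, rows):
        self.collections.setdefault(name, []).extend(rows)
        self.state_file.write_text(json.dumps(self.collections))

    def count(self, name):
        return len(self.collections[name])


def run_deploy_test(first_version, second_version):
    """Restart test if both versions match, upgrade test otherwise."""
    with tempfile.TemporaryDirectory() as tmp:
        volume = Path(tmp)

        # First run: create several collections and write data to each.
        first = FakeDeployment(volume, first_version)
        for i in range(3):
            first.insert(f"coll_{i}", [[0.1 * i, 0.2 * i]])
        del first  # simulate tearing down the first deployment

        # Second run: existing collections must still be readable and writable.
        second = FakeDeployment(volume, second_version)
        for i in range(3):
            assert second.count(f"coll_{i}") == 1
            second.insert(f"coll_{i}", [[1.0, 1.0]])
            assert second.count(f"coll_{i}") == 2

        # It must also still be possible to create a new collection.
        second.insert("coll_new", [[0.5, 0.5]])
        return sorted(second.collections)


print(run_deploy_test("previous", "previous"))  # restart test
print(run_deploy_test("previous", "later"))     # upgrade test
```

Passing the same version twice mirrors the restart test, and passing two different versions mirrors the upgrade test, which is why one workflow can serve both.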
4、Reliability testing
Reliability testing of cloud-native distributed systems usually adopts chaos engineering, whose aim is to nip errors and system failures in the bud. In other words, in a chaos engineering test we deliberately create system failures to expose problems under stress, and fix them before they really start to do harm. In Milvus chaos tests, Chaos Mesh serves as the tool for creating chaos, and the following fault types are injected:
- Pod kill: simulates a scenario in which a node goes down.
- Pod failure: tests whether the whole system can keep working when one of the worker node pods fails.
- Memory stress: simulates heavy consumption of memory and CPU resources on the worker nodes.
- Network partition: because Milvus separates storage from computing, the system relies heavily on communication between its components. To test the interdependencies between Milvus components, we simulate a scenario in which communication between different pods is partitioned.

Reliability testing framework in Milvus
The figure above shows the reliability testing framework that automates chaos tests in Milvus. The process is as follows:
- First, initialize a Milvus cluster by reading the deployment configuration.
- When the cluster is ready, run test_e2e.py to test whether Milvus functions are available.
- Run hello_milvus.py to test data persistence. It creates a collection named "hello_milvus" for data insertion, flush, index building, vector search, and query. This collection is not released or dropped during the test.
- Create a monitoring object that starts six threads, which run the create, insert, flush, index, search, and query operations respectively.
- Make the first assertion: all operations succeed as expected.
- Use Chaos Mesh to parse the YAML file that defines the fault, and inject the fault into Milvus. For example, kill the query node once every five seconds.
- Make the second assertion while the fault is in effect: judge whether the results returned by Milvus operations during the failure match expectations.
- Remove the fault through Chaos Mesh.
- When the Milvus service has recovered (that is, all pods are ready), make the third assertion: all operations succeed as expected.
- Run test_e2e.py to test whether Milvus functions are available. After the chaos is removed, some operations may remain blocked, which would get in the way of the third assertion. This step is therefore intended to facilitate the third assertion and serves as a criterion for checking that the Milvus service has recovered.
- Run hello_milvus.py to load the collection created earlier and perform CRUD operations on it. Then check whether the data that existed before the system failure is still available after recovery.
- Collect the logs.
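The monitoring object and the three assertions in the steps above can be sketched as follows. This is a simplified simulation, not the Milvus chaos test code: `FakeCluster` and `Checker` are hypothetical names, the fault is a flag rather than a real Chaos Mesh experiment, and the assumption that only search and query fail while the query node is down is made purely for illustration.

```python
import threading

OPERATIONS = ["create", "insert", "flush", "index", "search", "query"]


class FakeCluster:
    """Hypothetical cluster whose query node can be 'killed' via a flag."""

    def __init__(self):
        self.query_node_down = threading.Event()

    def run(self, op):
        # Assumption for this sketch: only search and query need the query node.
        if op in ("search", "query") and self.query_node_down.is_set():
            raise RuntimeError("query node unavailable")
        return "ok"


class Checker:
    """Runs one operation repeatedly and counts successes and failures."""

    def __init__(self, cluster, op):
        self.cluster, self.op = cluster, op
        self.succeeded = self.failed = 0

    def run_rounds(self, rounds):
        for _ in range(rounds):
            try:
                self.cluster.run(self.op)
                self.succeeded += 1
            except RuntimeError:
                self.failed += 1


def run_phase(cluster, rounds=20):
    """Start one thread per operation and report which operations passed."""
    checkers = [Checker(cluster, op) for op in OPERATIONS]
    threads = [threading.Thread(target=c.run_rounds, args=(rounds,)) for c in checkers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {c.op: c.failed == 0 for c in checkers}


def chaos_test():
    cluster = FakeCluster()

    before = run_phase(cluster)
    assert all(before.values())            # first assertion: everything works

    cluster.query_node_down.set()          # inject the fault
    during = run_phase(cluster)
    expected = {op: op not in ("search", "query") for op in OPERATIONS}
    assert during == expected              # second assertion: failures as expected

    cluster.query_node_down.clear()        # remove the fault
    after = run_phase(cluster)
    assert all(after.values())             # third assertion: full recovery
    return before, during, after


before, during, after = chaos_test()
print(during)
```

The point of the second assertion is that a chaos test does not simply expect everything to fail or everything to pass; it states in advance which operations should survive a given fault.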
5、Stability and performance testing
The following table describes the purpose, test scenarios, and metrics of the stability and performance tests.

Stability tests and performance tests share the same workflow:

Workflow of stability and performance tests
- Parse and update the configurations, and define the metrics. server-configmap corresponds to the Milvus standalone or cluster configuration, and client-configmap corresponds to the test case configurations.
- Configure the server and the client.
- Prepare the data.
- Run the requests between the client and the server.
- Report and display the metrics.
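The last two steps of this workflow, issuing requests and reporting metrics, can be sketched as follows. The metric names used here (QPS, average latency, p99 latency) are assumptions chosen for illustration rather than the exact metrics of the Milvus benchmark, and `run_benchmark` and `summarize` are hypothetical helpers; the request function is a stand-in for a real client call.

```python
import math
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list (pct in 1..100)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies, elapsed_seconds):
    """Turn raw per-request latencies into a small metrics report."""
    ordered = sorted(latencies)
    return {
        "requests": len(ordered),
        "qps": len(ordered) / elapsed_seconds,
        "avg_latency": sum(ordered) / len(ordered),
        "p99_latency": percentile(ordered, 99),
    }


def run_benchmark(request, payloads):
    """Send every payload through `request` and time each call."""
    latencies = []
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        request(payload)
        latencies.append(time.perf_counter() - t0)
    return summarize(latencies, time.perf_counter() - start)


def fake_search(vector):
    # Stand-in for a real vector search request against the server.
    return sum(x * x for x in vector)


report = run_benchmark(fake_search, [[0.1, 0.2, 0.3]] * 1000)
print(report["requests"], "requests completed")
```

In a stability test the same loop would simply run for much longer, with the report emitted periodically so that drifts in latency or throughput become visible over time.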
PART 05
Tools and Methods for Improving QA Efficiency
As the test module sections show, most testing procedures are similar: they mainly involve modifying the Milvus server and client configurations and passing API parameters. When there are multiple configurations, the more varied their combinations, the wider the range of scenarios the experiments and tests can cover. Reusing code and procedures is therefore essential to improving test efficiency.
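One way to see why reuse matters is to enumerate the configuration combinations that a single test procedure would have to cover. The sketch below uses a hypothetical `expand_matrix` helper, and the configuration keys and values are illustrative rather than taken from the Milvus test suite.

```python
from itertools import product


def expand_matrix(matrix):
    """Return one configuration dict per combination of the listed values."""
    keys = list(matrix)
    return [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]


# Illustrative values only; a real test matrix would come from the test plan.
matrix = {
    "deploy_mode": ["standalone", "cluster"],
    "index_type": ["FLAT", "IVF_FLAT", "HNSW"],
    "top_k": [1, 10],
}

cases = expand_matrix(matrix)
print(len(cases))   # 2 * 3 * 2 = 12 combinations
print(cases[0])
```

Three short lists already produce twelve scenarios, so writing each one by hand quickly becomes impractical, while a shared procedure fed with generated configurations scales without extra code.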
1、SDK test framework

SDK test framework
To speed up the testing process, we can add an API_request wrapper to the original test framework and set it up to act like an API gateway. This gateway collects all API requests and passes them to Milvus, then collectively receives the responses and passes them back to the client. This design makes it much easier to capture log information such as parameters and returned results. In addition, the checker component of the SDK test framework verifies and checks the results returned by Milvus. All check methods are defined in this checker component.
With the SDK test framework, we can also wrap key initialization steps into a single function, which removes a lot of tedious code. It is also worth noting that each individual test case is tied to its own unique collection, which ensures data isolation. When executing test cases, pytest-xdist, a pytest plugin, can run all individual test cases in parallel, which greatly improves efficiency.
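A request wrapper of this kind, together with per-test unique collection names, might look roughly like the following. This is a sketch under stated assumptions: `api_request`, `unique_collection_name`, and `init_collection` are hypothetical helpers written for illustration, and `fake_create` stands in for a real SDK call.

```python
import logging
import uuid

log = logging.getLogger("sdk_test")


def unique_collection_name(prefix="test"):
    """Give each test case its own collection so parallel runs do not collide."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def api_request(func, *args, **kwargs):
    """Gateway-style wrapper: log the request, call the SDK, log the response."""
    log.info("request: %s args=%r kwargs=%r", func.__name__, args, kwargs)
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        log.error("response: %s failed: %s", func.__name__, exc)
        return exc, False
    log.info("response: %s returned %r", func.__name__, result)
    return result, True


def init_collection(create_func, prefix="test"):
    """Wrap a common initialization step so test cases do not repeat it."""
    name = unique_collection_name(prefix)
    result, ok = api_request(create_func, name)
    assert ok, f"could not create collection {name}: {result}"
    return name


def fake_create(name):
    # Stand-in for a real SDK call that creates a collection.
    return name


first = init_collection(fake_create)
second = init_collection(fake_create)
print(first != second)
```

Because every call passes through one function, parameters and results are logged in one place, and because each test gets a fresh collection name, test cases stay independent when run in parallel, for example with `pytest -n auto` once pytest-xdist is installed.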
2、GitHub Actions

GitHub Actions
GitHub Actions is used to improve QA efficiency because of the following characteristics:
- It is a native CI/CD tool deeply integrated with GitHub.
- It provides uniformly configured machine environments, with Docker, Docker Compose, and other common software development tools preinstalled.
- It supports a variety of operating systems and versions, including Ubuntu, macOS, and Windows Server.
- It has a marketplace that offers rich extensions and out-of-the-box functionality.
- Its matrix supports concurrent jobs and reuses the same test procedure to improve efficiency.
Besides the characteristics above, another reason to adopt GitHub Actions is that deployment tests and reliability tests require independent, isolated environments, and GitHub Actions is well suited to routine checks on small datasets.
3、Benchmarking tools
A number of tools are used to make QA testing more efficient.

Overview of benchmarking tools
- Argo: a set of open-source tools for Kubernetes that can run workflows and manage clusters by scheduling tasks. It can also run multiple tasks in parallel.
- Kubernetes Dashboard: a web-based Kubernetes user interface that can be used to visualize server-configmap and client-configmap.
- NAS: network-attached storage (NAS) is a file-level data storage server that can be used to keep common ANN-benchmark datasets.
- InfluxDB and MongoDB: databases that can be used to store benchmark results.
- Grafana: can be used to monitor server resource metrics and client performance metrics.
- Redash: a tool that visualizes data and creates charts for the benchmark service.