XML file usage and parsing
2022-07-28 03:56:00 【Program three two lines】
I. Overview
XML and HTML are both markup languages, but XML is the Extensible Markup Language. In HTML every tag is predefined and has a fixed meaning (for example, <a> represents a link), and you cannot define tags of your own. In XML you define the tags yourself, which makes it well suited to describing and transferring data. For example:
<?xml version="1.0" encoding="UTF-8"?>
<!-- The declaration above must be on the first line of the XML document -->
<!-- Every XML document has exactly one root tag, which contains the child tags; tag names are case-sensitive -->
<goodlist>
<!-- A child tag can carry attributes of its own, such as id -->
<good id="111">
<name>apple</name>
<place>beijing</place>
</good>
<good>
<name>banana</name>
<place>shanghai</place>
</good>
</goodlist>
II. Constraint files
An XML file written for your own use has no fixed rules: tag names, attribute names and attribute values can be chosen freely. If other people are going to write or consume the file, though, the tags can no longer be arbitrary, so a constraint file is attached to the XML document. Two kinds are in common use: DTD (file suffix .dtd) and Schema (file suffix .xsd). They serve the same purpose and differ in power: Schema is the more advanced and comprehensive of the two.
Where the DTD is placed
Internal DTD
External DTD
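The source does not show the DTD itself, so the following is only a minimal sketch under assumptions. It invents a small DTD for the goodlist example and checks a document against it with the JDK's built-in JAXP parser. An internal DTD is written inside the document's own DOCTYPE declaration, while an external DTD lives in a separate file that the DOCTYPE points to. The class name DtdValidationSketch, the DTD rules and the file name goodlist.dtd are all hypothetical, and the external DTD is served from a string through an EntityResolver so that the sketch runs without any files on disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // Hypothetical rules: goodlist holds one or more good elements,
    // each with a name, a place and an optional id attribute.
    static final String DTD_RULES =
          "<!ELEMENT goodlist (good+)>\n"
        + "<!ELEMENT good (name, place)>\n"
        + "<!ATTLIST good id CDATA #IMPLIED>\n"
        + "<!ELEMENT name (#PCDATA)>\n"
        + "<!ELEMENT place (#PCDATA)>\n";

    static final String VALID_BODY =
        "<goodlist><good id=\"111\"><name>apple</name><place>beijing</place></good></goodlist>";

    // Breaks the rules on purpose: the good element has no place child.
    static final String INVALID_BODY =
        "<goodlist><good><name>banana</name></good></goodlist>";

    // Internal DTD: the rules sit inside the DOCTYPE declaration.
    static String withInternalDtd(String body) {
        return "<!DOCTYPE goodlist [\n" + DTD_RULES + "]>\n" + body;
    }

    // External DTD: the DOCTYPE only names a separate .dtd file.
    static String withExternalDtd(String body) {
        return "<!DOCTYPE goodlist SYSTEM \"goodlist.dtd\">\n" + body;
    }

    // Parses with DTD validation switched on and returns the validation errors.
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Assumption for this sketch: goodlist.dtd is supplied from memory.
            builder.setEntityResolver((publicId, systemId) ->
                systemId != null && systemId.endsWith("goodlist.dtd")
                    ? new InputSource(new StringReader(DTD_RULES))
                    : null);
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("internal, valid:   " + validate(withInternalDtd(VALID_BODY)));
        System.out.println("external, valid:   " + validate(withExternalDtd(VALID_BODY)));
        System.out.println("internal, invalid: " + validate(withInternalDtd(INVALID_BODY)));
    }
}
```

Both placements apply the same rules; the external form is the one to choose when several XML files share one DTD.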
III. Parsing
Overview
1. JAXP: the parser API supplied by Sun; supports both the DOM and the SAX approach
2. DOM4J: a very good parser
3. Jsoup: jsoup is a Java HTML parser that can parse a URL or HTML text directly. It provides a very convenient API for extracting and manipulating data through DOM traversal, CSS selectors and jQuery-like methods.
4. PULL: the parser built into the Android operating system; a streaming parser that, like SAX, processes the document as a sequence of events instead of building a tree.
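As a rough illustration of the two approaches JAXP supports, the sketch below reads the goodlist document from the overview both ways, using only the javax.xml.parsers API that ships with the JDK. The class and method names (JaxpParsingSketch, namesWithDom, namesWithSax) are made up for this example, and the XML is inlined as a string rather than read from a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JaxpParsingSketch {

    static final String XML =
          "<goodlist>"
        + "<good id=\"111\"><name>apple</name><place>beijing</place></good>"
        + "<good><name>banana</name><place>shanghai</place></good>"
        + "</goodlist>";

    // DOM approach: load the whole document into a tree, then navigate it.
    static List<String> namesWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("name");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // SAX approach: react to start/characters/end events as the parser streams by.
    static List<String> namesWithSax(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a name element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesWithDom(XML));
        System.out.println(namesWithSax(XML));
    }
}
```

DOM keeps the whole tree in memory, which makes navigation and modification easy; SAX keeps nothing, which suits large files that only need to be read once.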
Parsing with DOM4J
DOM4J takes the DOM approach: it loads the XML file into memory as a DOM tree and returns a Document object, which you then traverse to read the data.
To use it in a Java SE project, create a folder named lib in the project root and put the dom4j jar in it; in a web project, put the jar directly into the lib folder under WEB-INF.

You can also use dom4j to generate XML files.
Parsing with jsoup
Overview
jsoup is a Java HTML parser that can parse a URL or HTML text directly. It provides a very convenient API for extracting and manipulating data through DOM traversal, CSS selectors and jQuery-like methods.
* Steps:
1. Import the jar
2. Get a Document object
3. Get the Element object for the tag you want
4. Read the data
Example code
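// Imports needed at the top of the file; the statements below belong inside a method such as main
import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Get the path of student.xml on the classpath
String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
// Parse the XML file: load the document into memory and get the DOM tree as a Document
Document document = Jsoup.parse(new File(path), "utf-8");
// Get the Element objects for every name tag
Elements elements = document.getElementsByTag("name");
System.out.println(elements.size());
// Get the first name Element
Element element = elements.get(0);
// Read its text
String name = element.text();
System.out.println(name);
Using the objects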
1. Jsoup: utility class that parses an HTML or XML document and returns a Document
* parse: parses an HTML or XML document and returns a Document
* parse(File in, String charsetName): parses an XML or HTML file
* parse(String html): parses an XML or HTML string
* parse(URL url, int timeoutMillis): fetches the HTML or XML document at the given network address and returns its Document object
2. Document: the document object; represents the DOM tree in memory
* Getting Element objects
* getElementById(String id): gets the unique element with the given id attribute value
* getElementsByTag(String tagName): gets the collection of elements with the given tag name
* getElementsByAttribute(String key): gets the collection of elements that have the given attribute
* getElementsByAttributeValue(String key, String value): gets the collection of elements whose given attribute has the given value
3. Elements: a collection of Element objects; can be used like an ArrayList<Element>
4. Element: an element object
Getting child elements
* getElementById(String id): gets the unique element with the given id attribute value
* getElementsByTag(String tagName): gets the collection of elements with the given tag name
* getElementsByAttribute(String key): gets the collection of elements that have the given attribute
* getElementsByAttributeValue(String key, String value): gets the collection of elements whose given attribute has the given value
Getting an attribute value
* String attr(String key): gets the attribute value for the given attribute name
Getting text content
* String text(): gets the text content
* String html(): gets everything inside the tag body, including the markup of child tags
5. Node: a node object
* The superclass of Element, and through it of Document
6. Shortcut queries
selector: selectors
* Method: Elements select(String cssQuery)
* Syntax: see the syntax defined in the Selector class
XPath: XPath is the XML Path Language, a language for locating parts of an XML document (XML itself being a subset of the Standard Generalized Markup Language)
* Using XPath with jsoup requires importing an extra jar.
* Look up the XPath syntax in the w3school reference manual and use it to write the query
* Code:
//1. Get the path of student.xml
String path = JsoupDemo6.class.getClassLoader().getResource("student.xml").getPath();
//2. Get the Document object
Document document = Jsoup.parse(new File(path), "utf-8");
//3. Create a JXDocument object from the Document
JXDocument jxDocument = new JXDocument(document);
//4. Query using XPath syntax
// Query all student tags
List<JXNode> jxNodes = jxDocument.selN("//student");
for (JXNode jxNode : jxNodes) {
    System.out.println(jxNode);
}
System.out.println("--------------------");
// Query the name tags under all student tags
List<JXNode> jxNodes2 = jxDocument.selN("//student/name");
for (JXNode jxNode : jxNodes2) {
    System.out.println(jxNode);
}
System.out.println("--------------------");
// Query the name tags under student tags that have an id attribute
List<JXNode> jxNodes3 = jxDocument.selN("//student/name[@id]");
for (JXNode jxNode : jxNodes3) {
    System.out.println(jxNode);
}
System.out.println("--------------------");
// Query the name tags under student tags whose id attribute equals itcast
List<JXNode> jxNodes4 = jxDocument.selN("//student/name[@id='itcast']");
for (JXNode jxNode : jxNodes4) {
    System.out.println(jxNode);
}
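The same XPath expressions can also be tried without the extra jar, using the javax.xml.xpath API that ships with the JDK. Note that this swaps JsoupXpath for the standard JAXP XPath engine; it is a sketch for experimenting with the syntax, not the code used above. The contents of student.xml are not shown in this article, so the XML string below is a hypothetical stand-in, and the names XPathSketch and select are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {

    // Hypothetical stand-in for student.xml.
    static final String XML =
          "<students>"
        + "<student number=\"s001\"><name id=\"itcast\">tom</name><age>18</age></student>"
        + "<student number=\"s002\"><name>jack</name><age>20</age></student>"
        + "</students>";

    // Evaluates an XPath expression and returns the text content of each matching node.
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath query failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // All student tags
        System.out.println(select(XML, "//student"));
        // name tags under student tags
        System.out.println(select(XML, "//student/name"));
        // name tags that have an id attribute
        System.out.println(select(XML, "//student/name[@id]"));
        // name tags whose id attribute equals itcast
        System.out.println(select(XML, "//student/name[@id='itcast']"));
    }
}
```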