XPath
XPath uses path expressions to select nodes or node-sets in an XML
document. A node is selected by following a path made up of steps.
The XML Example Document
We will use the following XML document in the examples below.
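The examples in this section assume a small bookstore document. The following Python sketch defines one such document and parses it with the standard-library xml.etree.ElementTree module. The titles, prices, and lang values are illustrative assumptions, chosen so that the expressions in the tables below return distinct results.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: titles, prices and lang values are assumptions.
BOOKSTORE_XML = """\
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="de">Learning XML</title>
    <price>39.95</price>
  </book>
  <book>
    <title>XQuery Kick Start</title>
    <price>49.99</price>
  </book>
</bookstore>
"""

# fromstring() returns the root element, here <bookstore>.
root = ET.fromstring(BOOKSTORE_XML)

# One (title, lang, price) tuple per book, in document order.
summary = [
    (book.findtext("title"), book.find("title").get("lang"), book.findtext("price"))
    for book in root
]
for row in summary:
    print(row)
```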
Selecting Nodes
The most useful path expressions are listed below:
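+------------+------------------------------------------------------+
| Expression | Description                                          |
+============+======================================================+
| nodename   | Selects all child elements of the current node that  |
|            | are named "nodename"                                 |
+------------+------------------------------------------------------+
| /          | Selects from the root node                           |
+------------+------------------------------------------------------+
| //         | Selects nodes in the document from the current node  |
|            | that match the selection no matter where they are    |
+------------+------------------------------------------------------+
| .          | Selects the current node (prefixing a nested XPath   |
|            | query with . keeps it from matching nodes above the  |
|            | current node)                                        |
+------------+------------------------------------------------------+
| ..         | Selects the parent of the current node               |
+------------+------------------------------------------------------+
| @          | Selects attributes                                   |
+------------+------------------------------------------------------+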
In the table below we have listed some path expressions and the result
of the expressions:
+-----------------+------------------------------------------------------+
| Path Expression | Result                                               |
+=================+======================================================+
| bookstore       | Selects all bookstore elements that are children of  |
|                 | the current node                                     |
+-----------------+------------------------------------------------------+
| /bookstore      | Selects the root element bookstore                   |
|                 |                                                      |
|                 | Note: a path that starts with a slash (/) is always  |
|                 | an absolute path to an element.                      |
+-----------------+------------------------------------------------------+
| bookstore/book  | Selects all book elements that are children of       |
|                 | bookstore                                            |
+-----------------+------------------------------------------------------+
| //book          | Selects all book elements no matter where they are   |
|                 | in the document                                      |
+-----------------+------------------------------------------------------+
| bookstore//book | Selects all book elements that are descendants of    |
|                 | the bookstore element, no matter where they are      |
|                 | under the bookstore element                          |
+-----------------+------------------------------------------------------+
| //@lang         | Selects all attributes that are named lang           |
+-----------------+------------------------------------------------------+
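As a rough illustration, these path expressions can be tried from Python with xml.etree.ElementTree. This is a sketch under two assumptions: the sample document is a hypothetical stand-in, and ElementTree implements only a subset of XPath (no absolute paths on an element, no attribute results), so //@lang is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (assumed data, not from the original article).
BOOKSTORE_XML = """\
<bookstore>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="de">Learning XML</title><price>39.95</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>
"""

# The parsed root element plays the role of /bookstore.
bookstore = ET.fromstring(BOOKSTORE_XML)

# bookstore/book: book elements that are direct children of bookstore.
child_books = bookstore.findall("book")

# bookstore//book: book elements anywhere below bookstore.
descendant_books = bookstore.findall(".//book")

# "." is the current node; ".." is the parent of a matched node.
current = bookstore.find(".")
parent_of_first_title = bookstore.find("book/title/..")

# //@lang: ElementTree cannot return attributes, so collect the values by hand.
lang_values = [el.get("lang") for el in bookstore.iter() if "lang" in el.attrib]

print(len(child_books), len(descendant_books), lang_values)
```

A full XPath 1.0 engine would accept the expressions from the table verbatim. The relative forms used here depend on the context node being the bookstore element.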
Predicates
Predicates are used to find a specific node or a node that contains a
specific value.
Predicates are always embedded in square brackets.
In the table below we have listed some path expressions with predicates
and the result of the expressions:
+------------------------------------+------------------------------------------------+
| Path Expression                    | Result                                         |
+====================================+================================================+
| /bookstore/book[1]                 | Selects the first book element that is a child |
|                                    | of the bookstore element.                      |
|                                    |                                                |
|                                    | Note: In IE 5, 6, 7, 8, and 9 the first node   |
|                                    | is [0], but according to the W3C it is [1].    |
|                                    | To solve this problem in IE, set the           |
|                                    | SelectionLanguage to XPath. In JavaScript:     |
|                                    |                                                |
|                                    | xml.setProperty("SelectionLanguage","XPath");  |
+------------------------------------+------------------------------------------------+
| /bookstore/book[last()]            | Selects the last book element that is a child  |
|                                    | of the bookstore element                       |
+------------------------------------+------------------------------------------------+
| /bookstore/book[last()-1]          | Selects the last but one book element that is  |
|                                    | a child of the bookstore element               |
+------------------------------------+------------------------------------------------+
| /bookstore/book[position()<3]      | Selects the first two book elements that are   |
|                                    | children of the bookstore element              |
+------------------------------------+------------------------------------------------+
| //title[@lang]                     | Selects all the title elements that have an    |
|                                    | attribute named lang                           |
+------------------------------------+------------------------------------------------+
| //title[@lang='en']                | Selects all the title elements that have a     |
|                                    | "lang" attribute with a value of "en"          |
+------------------------------------+------------------------------------------------+
| /bookstore/book[price>35.00]       | Selects all the book elements of the bookstore |
|                                    | element that have a price element with a value |
|                                    | greater than 35.00                             |
+------------------------------------+------------------------------------------------+
| /bookstore/book[price>35.00]/title | Selects all the title elements of the book     |
|                                    | elements of the bookstore element that have a  |
|                                    | price element with a value greater than 35.00  |
+------------------------------------+------------------------------------------------+
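The predicates above can be sketched in Python with xml.etree.ElementTree, again against a hypothetical sample document. ElementTree understands positional and attribute predicates but, as far as its documented XPath subset goes, not position() comparisons or numeric tests such as price>35.00, so those two cases are emulated with ordinary Python filtering.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (assumed data, not from the original article).
BOOKSTORE_XML = """\
<bookstore>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="de">Learning XML</title><price>39.95</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>
"""
bookstore = ET.fromstring(BOOKSTORE_XML)
books = bookstore.findall("book")

# Positional predicates count from 1, as the W3C specifies.
first = bookstore.find("book[1]").findtext("title")
last = bookstore.find("book[last()]").findtext("title")
second_last = bookstore.find("book[last()-1]").findtext("title")

# Attribute predicates: any lang attribute, then lang equal to "en".
with_lang = [t.text for t in bookstore.findall(".//title[@lang]")]
english = [t.text for t in bookstore.findall(".//title[@lang='en']")]

# Emulation of book[position()<3]: the first two book children.
first_two = [b.findtext("title") for b in books[:2]]

# Emulation of book[price>35.00]/title: compare the price as a number.
expensive_titles = [
    b.findtext("title") for b in books if float(b.findtext("price")) > 35.00
]

print(first, last, second_last, with_lang, english, first_two, expensive_titles)
```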
Selecting Unknown Nodes
XPath wildcards can be used to select unknown XML nodes.
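+----------+------------------------------+
| Wildcard | Description                  |
+==========+==============================+
| *        | Matches any element node     |
+----------+------------------------------+
| @*       | Matches any attribute node   |
+----------+------------------------------+
| node()   | Matches any node of any kind |
+----------+------------------------------+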
In the table below we have listed some path expressions and the result
of the expressions:
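+-----------------+------------------------------------------------------+
| Path Expression | Result                                               |
+=================+======================================================+
| /bookstore/*    | Selects all the child element nodes of the bookstore |
|                 | element                                              |
+-----------------+------------------------------------------------------+
| //*             | Selects all elements in the document                 |
+-----------------+------------------------------------------------------+
| //title[@*]     | Selects all title elements which have at least one   |
|                 | attribute of any kind                                |
+-----------------+------------------------------------------------------+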
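A short Python sketch of the wildcard expressions, using xml.etree.ElementTree and a hypothetical sample document. The * wildcard is supported directly. The [@*] predicate and node() are, to the best of my knowledge, outside ElementTree's XPath subset, so //title[@*] is emulated by checking whether the attribute dictionary is non-empty.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (assumed data, not from the original article).
BOOKSTORE_XML = """\
<bookstore>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="de">Learning XML</title><price>39.95</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>
"""
bookstore = ET.fromstring(BOOKSTORE_XML)

# /bookstore/*: every child element of bookstore, whatever its name.
child_tags = [el.tag for el in bookstore.findall("*")]

# //*: every element in the document. iter() includes the root element,
# whereas findall(".//*") returns only the descendants of the context node.
all_tags = [el.tag for el in bookstore.iter()]
descendant_tags = [el.tag for el in bookstore.findall(".//*")]

# Emulation of //title[@*]: titles that carry at least one attribute.
titles_with_attrs = [t.text for t in bookstore.iter("title") if t.attrib]

print(child_tags, len(all_tags), len(descendant_tags), titles_with_attrs)
```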
Selecting Several Paths
By using the | operator in an XPath expression you can select several
paths.
In the table below we have listed some path expressions and the result
of the expressions:
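+---------------------------------+-----------------------------------------------+
| Path Expression                 | Result                                        |
+=================================+===============================================+
| //book/title | //book/price     | Selects all the title AND price elements of   |
|                                 | all book elements                             |
+---------------------------------+-----------------------------------------------+
| //title | //price               | Selects all the title AND price elements in   |
|                                 | the document                                  |
+---------------------------------+-----------------------------------------------+
| /bookstore/book/title | //price | Selects all the title elements of the book    |
|                                 | elements of the bookstore element AND all the |
|                                 | price elements in the document                |
+---------------------------------+-----------------------------------------------+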
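The union operator has no counterpart in Python's xml.etree.ElementTree, so the sketch below emulates //title | //price by walking a hypothetical sample document once and keeping elements with either name. It assumes the common engine behaviour of returning a union in document order, with each node appearing only once.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (assumed data, not from the original article).
BOOKSTORE_XML = """\
<bookstore>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="de">Learning XML</title><price>39.95</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>
"""
bookstore = ET.fromstring(BOOKSTORE_XML)

# Emulation of //title | //price: a single pass in document order, so the
# titles and prices come back interleaved rather than grouped by name.
wanted = {"title", "price"}
union_texts = [el.text for el in bookstore.iter() if el.tag in wanted]

# Concatenating two separate queries loses document order: all titles first.
concatenated = [el.text for el in bookstore.findall(".//title")] + [
    el.text for el in bookstore.findall(".//price")
]

print(union_texts)
print(concatenated)
```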
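In an expression such as
normalize-space(./span[@class="app-history-action"]), normalize-space
can be treated as roughly equivalent to reading InnerText: it returns
the text of the matched node with leading and trailing whitespace
removed and internal runs of whitespace collapsed to a single space.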
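The effect of normalize-space can be sketched in Python. The helper below is a hypothetical re-implementation for illustration, not a library call: it takes the string value of an element (all descendant text concatenated), trims it, and collapses runs of XML whitespace. The HTML fragment is assumed sample data.

```python
import re
import xml.etree.ElementTree as ET


def normalize_space(text: str) -> str:
    """Mimic XPath 1.0 normalize-space(): collapse whitespace runs, then trim."""
    # XML whitespace is space, tab, carriage return and line feed.
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


# Hypothetical fragment with untidy whitespace inside the span.
FRAGMENT = """\
<li>
  <span class="app-history-action">
      Updated
      <b>profile</b>   picture
  </span>
</li>
"""
item = ET.fromstring(FRAGMENT)
span = item.find("./span[@class='app-history-action']")

# The XPath string value of an element is the text of all its descendants.
raw = "".join(span.itertext())
clean = normalize_space(raw)

print(repr(raw))
print(repr(clean))
```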
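Using XPath to Test Whether a Node Has or Lacks a Given Attribute or Value
When extracting the InnerText of an HTML tag from a repeated list with
an XPath path, you may need to check whether the tag's class attribute
contains a particular value. XPath's contains function handles this; the
code is as follows: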
// Select span nodes that have no class attribute
var noClass = node.SelectNodes(".//span[not(@class)]");
// Select span nodes that have neither a class nor an id attribute
var noClassOrId = node.SelectNodes(".//span[not(@class) and not(@id)]");
// Select span nodes whose class attribute does not contain "expire"
var notExpire = node.SelectNodes(".//span[not(contains(@class,'expire'))]");
// Select span nodes whose class attribute contains "expire"
var hasExpire = node.SelectNodes(".//span[contains(@class,'expire')]");
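To make the matching rules concrete, here is a Python sketch that emulates the four predicates over a hypothetical HTML fragment. It is not a translation of the C# calls: xml.etree.ElementTree has no contains() or not(), so each predicate is written as a plain Python test. Two details of XPath 1.0 are reproduced: contains() is a substring test, so class="tag expired" also matches 'expire', and a missing attribute behaves like an empty string.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the class and id values are assumed sample data.
FRAGMENT = """\
<div>
  <span class="expire">A</span>
  <span class="tag expired">B</span>
  <span class="active" id="x">C</span>
  <span>D</span>
  <span id="y">E</span>
</div>
"""
node = ET.fromstring(FRAGMENT)
spans = node.findall(".//span")

# .//span[not(@class)]
no_class = [s.text for s in spans if "class" not in s.attrib]

# .//span[not(@class) and not(@id)]
no_class_or_id = [
    s.text for s in spans if "class" not in s.attrib and "id" not in s.attrib
]

# .//span[contains(@class,'expire')]: substring match on the class value.
has_expire = [s.text for s in spans if "expire" in s.get("class", "")]

# .//span[not(contains(@class,'expire'))]: spans without a class also match.
not_expire = [s.text for s in spans if "expire" not in s.get("class", "")]

print(no_class, no_class_or_id, has_expire, not_expire)
```

If only whole class tokens should match, a common XPath idiom is contains(concat(' ', normalize-space(@class), ' '), ' expire ').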
Unless otherwise stated, all articles on this blog are licensed under
CC BY-SA 4.0. Please credit the source when reposting.