XPath

XPath uses path expressions to select nodes or node-sets in an XML
document. A path expression is a sequence of steps, and a node is
selected by following those steps.

The XML Example Document

We will use the following XML document in the examples below.

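    <bookstore>
      <book>
        <title lang="en">Harry Potter</title>
        <price>29.99</price>
      </book>
      <book>
        <title lang="en">Learning XML</title>
        <price>39.95</price>
      </book>
    </bookstore>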

Selecting Nodes

The most useful path expressions are listed below:


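+------------+----------------------------------------------------------+
| Expression | Description                                              |
+============+==========================================================+
| nodename   | Selects all nodes with the name "nodename"               |
+------------+----------------------------------------------------------+
| /          | Selects from the root node                               |
+------------+----------------------------------------------------------+
| //         | Selects nodes that match the selection at any depth:     |
|            | anywhere in the document when the path starts with //,   |
|            | otherwise anywhere below the node reached so far         |
+------------+----------------------------------------------------------+
| .          | Selects the current node (in a nested XPath query,       |
|            | starting the path with . keeps the search under the      |
|            | current node instead of picking up a higher level)       |
+------------+----------------------------------------------------------+
| ..         | Selects the parent of the current node                   |
+------------+----------------------------------------------------------+
| @          | Selects attributes                                       |
+------------+----------------------------------------------------------+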

In the table below we have listed some path expressions and the result
of the expressions:

+-----------------+-----------------------------------------------------+
| Path Expression | Result                                              |
+=================+=====================================================+
| bookstore       | Selects all nodes with the name "bookstore"         |
+-----------------+-----------------------------------------------------+
| /bookstore      | Selects the root element bookstore                  |
|                 |                                                     |
|                 | Note: If the path starts with a slash ( / ) it      |
|                 | always represents an absolute path to an element!   |
+-----------------+-----------------------------------------------------+
| bookstore/book  | Selects all book elements that are children of      |
|                 | bookstore                                           |
+-----------------+-----------------------------------------------------+
| //book          | Selects all book elements no matter where they are  |
|                 | in the document                                     |
+-----------------+-----------------------------------------------------+
| bookstore//book | Selects all book elements that are descendants of   |
|                 | the bookstore element, no matter where they are     |
|                 | under the bookstore element                         |
+-----------------+-----------------------------------------------------+
| //@lang         | Selects all attributes that are named lang          |
+-----------------+-----------------------------------------------------+
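As a rough illustration, the sketch below runs a few of these path expressions with Python's standard xml.etree.ElementTree module. Two things are assumptions here: the element and attribute markup of the bookstore document (reconstructed from the expressions used in this article), and the choice of ElementTree itself, which implements only a subset of XPath. In particular it evaluates paths relative to an element, so the examples start with . instead of /, and it returns elements rather than attribute nodes, so //@lang is approximated by reading the attribute from each title.

```python
import xml.etree.ElementTree as ET

# The example bookstore document; the markup is an assumed reconstruction.
BOOKSTORE_XML = """
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)  # root is the <bookstore> element

# "book": book elements that are direct children of the current node.
child_books = root.findall("book")

# ".//title": title elements at any depth below the current node.
titles = [t.text for t in root.findall(".//title")]

# "..": step from each title up to its parent element.
parents = [p.tag for p in root.findall(".//title/..")]

# ElementTree returns elements only, so @lang is read from each element.
langs = [t.get("lang") for t in root.findall(".//title")]

print(len(child_books), titles, parents, langs)
```

A full XPath 1.0 engine would also accept the absolute forms from the table, such as /bookstore/book and //@lang.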

Predicates

Predicates are used to find a specific node or a node that contains a
specific value.

Predicates are always embedded in square brackets.

In the table below we have listed some path expressions with predicates
and the result of the expressions:

+------------------------------------+------------------------------------------+
| Path Expression                    | Result                                   |
+====================================+==========================================+
| /bookstore/book[1]                 | Selects the first book element that is   |
|                                    | the child of the bookstore element       |
+------------------------------------+------------------------------------------+
| /bookstore/book[last()]            | Selects the last book element that is    |
|                                    | the child of the bookstore element       |
+------------------------------------+------------------------------------------+
| /bookstore/book[last()-1]          | Selects the last but one book element    |
|                                    | that is the child of the bookstore       |
|                                    | element                                  |
+------------------------------------+------------------------------------------+
| /bookstore/book[position()<3]      | Selects the first two book elements that |
|                                    | are children of the bookstore element    |
+------------------------------------+------------------------------------------+
| //title[@lang]                     | Selects all the title elements that have |
|                                    | an attribute named lang                  |
+------------------------------------+------------------------------------------+
| //title[@lang='en']                | Selects all the title elements that have |
|                                    | a "lang" attribute with a value of "en"  |
+------------------------------------+------------------------------------------+
| /bookstore/book[price>35.00]       | Selects all the book elements of the     |
|                                    | bookstore element that have a price      |
|                                    | element with a value greater than 35.00  |
+------------------------------------+------------------------------------------+
| /bookstore/book[price>35.00]/title | Selects all the title elements of the    |
|                                    | book elements of the bookstore element   |
|                                    | that have a price element with a value   |
|                                    | greater than 35.00                       |
+------------------------------------+------------------------------------------+

Note on /bookstore/book[1]: In IE 5, 6, 7, 8, and 9 the first node is
[0], but according to W3C it is [1]. To solve this problem in IE, set
the SelectionLanguage to XPath. In JavaScript:

    xml.setProperty("SelectionLanguage","XPath");
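The predicates above can be tried out with a short sketch, again using Python's xml.etree.ElementTree as an assumed stand-in for a full XPath engine. ElementTree handles the positional forms ([1], [last()], [last()-1]) and the attribute tests, but not comparisons such as [price>35.00] or [position()<3], so the price filter below is ordinary Python that mirrors the predicate rather than XPath itself. The helper name book_titles and the document markup are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The example bookstore document; the markup is an assumed reconstruction.
BOOKSTORE_XML = """
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)


def book_titles(path):
    """Return the title text of every book element matched by path."""
    return [book.findtext("title") for book in root.findall(path)]


first = book_titles("book[1]")                 # positions start at 1
last = book_titles("book[last()]")
second_to_last = book_titles("book[last()-1]")

# Attribute predicates: presence test and exact-value test.
with_lang = [t.text for t in root.findall(".//title[@lang]")]
english = [t.text for t in root.findall(".//title[@lang='en']")]

# Plain-Python equivalent of /bookstore/book[price>35.00]/title,
# because ElementTree has no numeric comparison predicates.
expensive = [b.findtext("title") for b in root.findall("book")
             if float(b.findtext("price")) > 35.00]

print(first, last, second_to_last, with_lang, english, expensive)
```

With only two books in the document, book[last()-1] and book[1] pick the same element.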

Selecting Unknown Nodes

XPath wildcards can be used to select unknown XML nodes.


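+----------+------------------------------+
| Wildcard | Description                  |
+==========+==============================+
| *        | Matches any element node     |
+----------+------------------------------+
| @*       | Matches any attribute node   |
+----------+------------------------------+
| node()   | Matches any node of any kind |
+----------+------------------------------+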

In the table below we have listed some path expressions and the result
of the expressions:


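+-----------------+-----------------------------------------------------+
| Path Expression | Result                                              |
+=================+=====================================================+
| /bookstore/*    | Selects all the child element nodes of the          |
|                 | bookstore element                                   |
+-----------------+-----------------------------------------------------+
| //*             | Selects all elements in the document                |
+-----------------+-----------------------------------------------------+
| //title[@*]     | Selects all title elements which have at least one  |
|                 | attribute of any kind                               |
+-----------------+-----------------------------------------------------+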
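A small sketch of the element wildcard, assuming Python's xml.etree.ElementTree and the same reconstructed bookstore markup. Paths are evaluated relative to the bookstore element, so .//* returns its descendants but not bookstore itself, whereas //* in a full XPath engine would include it. The [@*] test is emulated in plain Python here, since this sketch does not rely on ElementTree supporting that predicate.

```python
import xml.etree.ElementTree as ET

# The example bookstore document; the markup is an assumed reconstruction.
BOOKSTORE_XML = """
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)

# "*": every child element of the current node (like /bookstore/*).
children = [el.tag for el in root.findall("*")]

# ".//*": every element below the current node, in document order.
descendants = [el.tag for el in root.findall(".//*")]

# Plain-Python stand-in for //title[@*]: titles with at least one attribute.
titles_with_attrs = [t.text for t in root.findall(".//title") if t.attrib]

print(children, descendants, titles_with_attrs)
```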


Selecting Several Paths

By using the | operator in an XPath expression you can select several
paths.

In the table below we have listed some path expressions and the result
of the expressions:


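+---------------------------------+--------------------------------------------+
| Path Expression                 | Result                                     |
+=================================+============================================+
| //book/title | //book/price     | Selects all the title AND price elements   |
|                                 | of all book elements                       |
+---------------------------------+--------------------------------------------+
| //title | //price               | Selects all the title AND price elements   |
|                                 | in the document                            |
+---------------------------------+--------------------------------------------+
| /bookstore/book/title | //price | Selects all the title elements of the book |
|                                 | elements of the bookstore element AND all  |
|                                 | the price elements in the document         |
+---------------------------------+--------------------------------------------+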


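normalize-space(./span[@class="app-history-action"])

Here normalize-space returns the text of the matched span with leading
and trailing whitespace removed and runs of internal whitespace
collapsed to single spaces, so it is roughly equivalent to reading a
trimmed InnerText.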

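Checking in XPath whether an attribute is present or contains a given value

When using an XPath path to extract the InnerText of an HTML tag from a
repeated list, you sometimes need to check whether the tag's class
attribute contains a particular value. XPath's contains function, together
with not, handles this. The code is as follows: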

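// Select spans that have no class attribute
var spansWithoutClass = node.SelectNodes(".//span[not(@class)]");

// Select spans that have neither a class nor an id attribute
var spansWithoutClassOrId = node.SelectNodes(".//span[not(@class) and not(@id)]");

// Select spans whose class attribute does not contain "expire"
var spansWithoutExpire = node.SelectNodes(".//span[not(contains(@class,'expire'))]");

// Select spans whose class attribute contains "expire"
var spansWithExpire = node.SelectNodes(".//span[contains(@class,'expire')]");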
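To make the behaviour of these four expressions concrete, here is a minimal Python sketch that mirrors their semantics on a made-up fragment. Everything in it is an assumption for illustration: the markup, the class values, and the use of plain Python filtering, since the standard xml.etree.ElementTree module does not implement not() or contains(). The point it shows is that contains is a substring test, so a class of "expired-note" also matches 'expire', and that a missing attribute behaves like an empty string.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the class and id values are made up for illustration.
FRAGMENT = """
<div>
  <span class="title">Widget</span>
  <span class="price expire">9.99</span>
  <span class="expired-note">sold out</span>
  <span id="sku">A-1</span>
  <span>plain</span>
</div>
"""

node = ET.fromstring(FRAGMENT)
spans = node.findall(".//span")

# Mirrors .//span[not(@class)]
no_class = [s.text for s in spans if "class" not in s.attrib]

# Mirrors .//span[not(@class) and not(@id)]
no_class_no_id = [s.text for s in spans
                  if "class" not in s.attrib and "id" not in s.attrib]

# Mirrors .//span[contains(@class,'expire')]; a missing class acts as "".
has_expire = [s.text for s in spans if "expire" in s.get("class", "")]

# Mirrors .//span[not(contains(@class,'expire'))]
lacks_expire = [s.text for s in spans if "expire" not in s.get("class", "")]

print(no_class, no_class_no_id, has_expire, lacks_expire)
```

Note that the not(contains(...)) form also selects spans that have no class attribute at all, which may or may not be what a scraper wants.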


Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting.