DR. AJAY KUMAR PATHAK
ASSISTANT PROFESSOR
READ ALL THE NOTES CHAPTER WISE
SUBJECT NAME:- MJ–12 (Th):- WEB PROGRAMMING
FOR B. Sc. IT.
SEM 6 F.Y.U.G.P.
(To be selected by the students from)
UNIT 2 (UNIT NAME):- UNDERSTANDING XML
Objective: The objective of the course is to enable students to
· The objective of this course is to provide students with a comprehensive understanding of network security concepts and techniques. The course aims to develop students' skills in identifying network vulnerabilities, implementing security measures, and ensuring the confidentiality, integrity, and availability of networked systems.
Learning Outcome:- After completion of this course, a student will be able to–
· Understand the principles and concepts of network security.
· Identify potential security threats and vulnerabilities in networked systems.
· Implement security measures to protect network infrastructure.
· Apply encryption and authentication techniques to secure network communication.
· Analyze and respond to security incidents in networked environments
Semester Examination and Distribution of Marks
INTERNAL MARKS :- 25 (NO PRACTICAL IN THE MJ 12 (WEB PROGRAMMING))
End Semester Examination (ESE) : 75 Marks
-: NOTES
READ FROM HERE :-
SYLLABUS OF UNIT 2 :- Understanding XML:
Overview of XML, Creating XML Documents, Rules for Well-Formed XML, Adding
Comments, CDATA Sections, Creating a DTD-The Concept of a Valid XML Document,
Creating a DTD for an existing XML File.
UNIT- 2
:- UNDERSTANDING XML:-
XML stands for Extensible Markup Language. A markup language is a set of codes, or tags, that describes the text in a digital document. XML is a markup language similar to HTML, but XML tags are not predefined. Instead, you define your own tags designed specifically for your needs. This is a powerful way to keep data in a format that can be stored, searched, transported and shared. XML is platform independent and language independent.
For example, Microsoft Office versions 2007 and later use XML for their document structure. So, when you save a document in Word, Excel or PowerPoint, the file extension ends with an "x", which stands for XML. For a Word document, the file name ends with ".docx".
XML documents are built as element trees.
An XML tree begins with a root element and branches to child elements.
All elements can have child elements (sub-elements):
<root>
<child>
<subchild>…..</subchild>
</child>
</root>
Comments are written between <!-- and -->:
<!-- This is a comment
that spans
multiple lines. -->
<!-- This is a single-line comment. -->
A simple XML document looks like this:
<?xml version="1.0"?>
<contact-info>
<name>Ajay Kumar Pathak</name>
<company>You are learning XML Program</company>
<phone>1234567890</phone>
</contact-info>
The document above contains two kinds of information:
Markup, such as <contact-info>
The text, or the character information, such as Ajay Kumar Pathak and 1234567890.
SIMPLE EXAMPLE OF AN XML PROGRAM
(Note: The prolog on the first line specifies that the file contains XML version 1.0 data, encoded using Unicode Transformation Format 8 (UTF-8). UTF-8 is backward compatible with ASCII text and can represent every Unicode character, so it is not limited to the American English character set.)
<?xml version="1.0" encoding="UTF-8"?>
<Company>
<Employee>
<FirstName>AJAY</FirstName>
<MiddleName>KUMAR</MiddleName>
<LastName>PATHAK</LastName>
<ContactNo>1234567890</ContactNo>
<Email>abcd@xyz.com</Email>
<Address>
<City>Jamshedpur</City>
<State>Jharkhand</State>
<Zip>831002</Zip>
</Address>
</Employee>
</Company>
RULES:- An XML declaration should abide by the following rules −
1) If the XML declaration is present in the XML, it must be placed as the first line in the XML document.
2) If the XML declaration is included, it must contain version number attribute.
3) The Parameter names and values are case-sensitive.
4) The names are always in lower case.
5) The order of placing the parameters is important. The correct order is: version, encoding and standalone.
6) Either single or double quotes may be used.
7) The XML declaration has no closing tag i.e. </?xml>
STRUCTURE OF AN XML DOCUMENT:
An XML document is made up of the following parts.
(1).. XML Declaration / definition :- The XML Declaration provides basic information about the format for the rest of the XML document. It takes the form of a Processing Instruction and can have the attributes version, encoding and standalone. It is optional, but when used, it must appear in the first line of the XML document.
EXAMPLE:- <?xml version="1.0" encoding="utf-8"?>
(Note: UTF stands for Unicode Transformation Format. UTF-8 is an encoding system for Unicode. It can translate any Unicode character to a matching unique binary string, and can also translate the binary string back to a Unicode character.)
(2).. Elements :- XML elements can be defined as the building blocks of an XML document. Elements can behave as containers to hold text, elements, attributes, media objects or all of these.
Each XML document contains one or more elements, the scope of which is either enclosed by start and end tags or, for empty elements, by an empty-element tag.
Syntax
The following examples show the ways to write an XML element −
Example:- Empty element
<MyElement/>
Example:- Element with no content
<MyElement></MyElement>
Example:- Element with text
<MyElement>Some Text</MyElement>
(3).. Text :- In XML, text is the content enclosed within elements. XML is designed to represent structured data, and text within elements serves as the data that you want to store or transmit.
Example:
<book>
<title>XML Basics</title>
<author>Ajay Pathak</author>
<published>2023</published>
</book>
In this example, the <title>, <author>, and <published> elements contain text content, which is the data associated with those elements.
(4).. Attributes :- Attributes are used to provide additional information about an element. Attributes are typically name-value pairs that are placed within the start tag of an element. (The value has to be in double (" ") or single (' ') quotes. Attribute names must be unique within an element.)
Example: -
<elementName attributeName="attributeValue">elementContent</elementName>
Let's break down the parts:
<elementName>: This is
the name of the XML element.
attributeName: This is the
name of the attribute.
attributeValue: This is the
value associated with the attribute.
elementContent: This is the
content of the XML element.
Example:
<person firstName="Ajay" lastName="Pathak" age="45">
<address>
<street>123 Bistupur</street>
<city>Jamshedpur</city>
</address>
</person>
We have an XML element named "person" with three attributes: "firstName," "lastName," and "age." These attributes provide information about the person, such as their first name, last name, and age.
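The attribute example above can be read back with a parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (the notes themselves do not prescribe any parser; the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

# The <person> example from the notes, stored as a string.
PERSON_XML = """
<person firstName="Ajay" lastName="Pathak" age="45">
  <address>
    <street>123 Bistupur</street>
    <city>Jamshedpur</city>
  </address>
</person>
"""

person = ET.fromstring(PERSON_XML)

# Attributes are exposed as name-value pairs; every value is a string.
print(person.attrib)
print(person.get("firstName"))

# Child elements are reached by their tag names.
city = person.find("address/city").text
print(city)
```

Note that the age comes back as the string "45", not a number: XML attributes carry text only, and the application decides how to interpret it.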
(5). Entities :- Entities in XML have a role similar to variables in programming languages. Instead of writing a value explicitly every time, you store it once under a name and reuse that name throughout the document.
Entities are also used to define shortcuts to special characters within XML documents. Entities are primarily of four types −
1..Built-in entities
2..Character entities
3..General entities
4..Parameter entities
In general, entities can be declared internally or externally. Let us understand each of these and their syntax as follows −
Syntax
Following is the syntax for internal entity declaration −
<!ENTITY entity_name "entity_value">
In the above syntax −
entity_name is the name of the entity, followed by its value within double or single quotes.
entity_value holds the value for the entity name.
The entity value of the internal entity is de-referenced by adding the prefix & and the suffix ; to the entity name, i.e. &entity_name;
Example
<!DOCTYPE address [
<!ELEMENT address (#PCDATA)> <!-- #PCDATA means parsed character data -->
<!ENTITY name "Tanmaypatil">
<!ENTITY company "AjayTutorials">
<!ENTITY phone_no "014154234567">
]>
<address>
&name;
&company;
&phone_no;
</address>
In the above example, the respective entity names name, company and phone_no are replaced by their values in the XML document. The entity values are de-referenced by adding the prefix & and the suffix ; to the entity name.
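To see the replacement happen, the entity example can be passed through a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which expands internal entities declared inside the DOCTYPE (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

# Internal entities are declared between "<!DOCTYPE address [" and "]>".
ADDRESS_XML = """<!DOCTYPE address [
<!ELEMENT address (#PCDATA)>
<!ENTITY name "Tanmaypatil">
<!ENTITY company "AjayTutorials">
<!ENTITY phone_no "014154234567">
]>
<address>
&name;
&company;
&phone_no;
</address>"""

address = ET.fromstring(ADDRESS_XML)

# The parser has already replaced each &entity; reference by its value.
values = address.text.split()
print(values)
```

The application never sees &name; itself, only the text the entity stands for.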
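Following is the syntax for External Entity declaration −
<!ENTITY name SYSTEM "URI/URL">
In the above syntax −
name is the name of the entity, SYSTEM is a keyword, and "URI/URL" is the address of the external source, enclosed in double or single quotes.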
XML NAMING RULES
XML elements must follow these naming rules:
1) Element names are case-sensitive
2) Element names must start with a letter or underscore
3) Element names cannot start with the letters xml (or XML, or Xml, etc)
4) Element names can contain letters, digits, hyphens, underscores, and periods
5) Element names cannot contain spaces
6) Any name can be used, no words are reserved (except xml).
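The naming rules above can be turned into a small checker. This is only a sketch in Python under a stated simplification: it accepts ASCII letters only, whereas real XML also allows many Unicode letters and the colon used by namespaces. The function name is_valid_element_name is hypothetical.

```python
import re

# ASCII-only simplification of the rules above:
# first character: letter or underscore; then letters, digits, '.', '_' or '-'.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_element_name(name: str) -> bool:
    """Return True if the name follows the simplified naming rules."""
    # Names starting with xml (in any letter case) are reserved.
    if name.lower().startswith("xml"):
        return False
    # fullmatch rejects spaces, a leading digit, and the empty string.
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["first_name", "book-title.v2", "2ndItem", "first name", "XmlData"]:
    print(candidate, is_valid_element_name(candidate))
```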
XML Advantages
1) XML is extensible.
2) It is human-readable and easy to understand.
3) Completely portable and also compatible with Java.
4) XML is platform independent; hence it can be used by any system.
5) XML supports Unicode.
6) Using XML, data can be stored and transported at any point in time without affecting data presentation.
7) A parser checks every XML document, so syntax errors are detected before the data is used.
8) Data sharing between various systems is simplified using XML.
XML Disadvantages
1) Compared to other text-based formats, XML is redundant and verbose.
2) When data volume is large, it results in high storage and transportation cost due to redundancy in XML syntax.
3) Compared to other text-based formats, XML is less readable.
4) Due to its lengthy nature, the XML file size is very large.
5) XML does not support an array.
DIFFERENCE BETWEEN XML AND HTML
| HTML | XML |
| Is a markup language. | Is a standard markup language that defines other markup languages. |
| Is not case sensitive. | Is case sensitive. |
| Doubles up as a presentation language. | Is not a presentation language nor a programming language. |
| Has its own predefined tags. | Tags are defined as per the need of the programmer. XML is flexible as tags can be defined when needed. |
| Closing tags are not necessarily needed. | Closing tags are used mandatorily. |
| White spaces are not preserved. | Capable of preserving white spaces. |
| Showcases the design of a web page in the way it is displayed on client-side. | Enables transportation of data from database and related applications. |
| Used for displaying data. | Used for transferring data. |
| Static in nature. | Dynamic in nature. |
| Offers native support. | With the help of elements and attributes, objects are expressed by conventions. |
| Null value is natively recognized. | xsi:nil on elements is needed in an XML instance document. |
| Extra application code is not needed to parse text. | XML DOM (Document Object Model) application and implementation code is needed to map text back into JavaScript objects. |
| EXAMPLE: <!DOCTYPE html><html><head><title> Page title </title></head><body><h1> First Heading</h1><p> First paragraph.</p></body></html> | EXAMPLE: <?xml version="1.0"?> <address><name> AJAY KUMAR PATHAK</name><contact>1234567890</contact><email>ajay1@gmail.com</email><birthdate>1980-09-27</birthdate></address> |
VVI : XML DOES NOT DO ANYTHING
Maybe it is a little hard to understand, but XML does not DO anything.
<note>
<to>AJAY</to>
<from>JAMSHEDPUR</from>
<heading>THIS IS 1ST XML FILE</heading>
<body>WELCOME TO ALL</body>
</note>
The XML above is quite self-descriptive:
1. It has sender information
2. It has receiver information
3. It has a heading
4. It has a message body
But still, the XML above does not DO anything. XML is just information wrapped in tags.
CREATING XML DOCUMENTS : -
STEPS:-
1. Open your text editor of choice.
2. On the first line, write an XML declaration.
3. Set your root element below the declaration.
4. Add your child elements within the root element.
5. Review your file for errors.
6. Save your file with the .xml file extension.
7. Test your file by opening it in the browser window.
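The steps above describe writing the file by hand in a text editor. As an alternative, a program can build the same kind of document. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the element names follow the book list used in these notes.

```python
import xml.etree.ElementTree as ET

# Step 3: the root element.
library = ET.Element("library")

# Step 4: child elements inside the root; id is an attribute of <book>.
book = ET.SubElement(library, "book", id="101")
ET.SubElement(book, "title").text = "XML Basics"
ET.SubElement(book, "price").text = "600.50"

# Serialize the tree to a string (this call does not add an XML declaration).
xml_text = ET.tostring(library, encoding="unicode")
print(xml_text)

# Step 5: review for errors by parsing the result back.
reparsed = ET.fromstring(xml_text)
print(reparsed.find("book").get("id"))
```

Building the tree through a library guarantees matching start and end tags, which removes one common source of errors in hand-written files.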
EXAMPLE WITH QUESTIONS
EXAMPLE 1: SIMPLE XML FOR BOOK LIST (LIB1.XML)
<?xml version="1.0" encoding="UTF-8"?>
<!-- THIS IS THE DOCUMENT PROLOG. This is an XML file containing BOOKS records -->
<library> <!-- FROM HERE TO THE LAST LINE (</library>) IS CALLED THE DOCUMENT ELEMENT -->
<book id="101">
<title>XML Basics</title>
<author>DR. AJAY KR PATHAK</author>
<price>600.50</price>
</book>
<book id="102">
<title>Learning Python</title>
<author>MIHIR KUMAR</author>
<price>600.00</price>
</book>
</library>
Explanation of the Above Code
- <library> is the root element.
- <book> is a child element and has an attribute id.
- Inside each <book> are nested tags: <title>, <author>, <price>.
Saving this code creates the LIB1.XML file.
(Note: Document Prolog Section :- The document prolog comes at the top of the document, before the root element. This section contains the XML declaration and the document type declaration.)
END
EXAMPLE NO 2 :- CREATING XML DOCUMENTS FILE OF LIBRARY (LIB2.XML)
<?xml version='1.0' encoding='utf-8'?>
<library>
<book id="B001">
<title>The Guide</title>
<author>R.K. Narayan</author>
<genre>Fiction</genre>
<price>399.99</price>
<year>1958</year>
</book>
<book id="B002">
<title>Wings of Fire</title>
<author>A.P.J. Abdul Kalam</author>
<genre>Autobiography</genre>
<price>499.50</price>
<year>1999</year>
</book>
<book id="B003">
<title>Train to Pakistan</title>
<author>Khushwant Singh</author>
<genre>Historical Fiction</genre>
<price>299.00</price>
<year>1956</year>
</book>
<book id="B004">
<title>God of Small Things</title>
<author>Arundhati Roy</author>
<genre>Literary Fiction</genre>
<price>450.00</price>
<year>1997</year>
</book>
</library>
EXAMPLE 3: USING HTML AND XSLT (ALTERNATIVE TO JAVASCRIPT)
If you do not want to use JavaScript and prefer only HTML and XML, the best way to display an XML file in a browser is by using XSLT (Extensible Stylesheet Language Transformations).
STEP 1: CREATE AN XML FILE (BOOKS.XML)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<!-- The second line links to an XSLT file (books.xsl) that will format the XML for display. -->
<library>
<book>
<title>WEB TECHNOLOGY </title>
<author> DR. AJAY KUMAR PATHAK </author>
<price>800</price>
</book>
<book>
<title>The Guide</title>
<author>DR. AJAY KUMAR PATHAK </author>
<price>800</price>
</book>
</library>
STEP 2: CREATE AN XSLT FILE (BOOKS.XSL)
THIS FILE FORMATS AND DISPLAYS THE XML DATA IN A TABLE WHEN OPENED IN A BROWSER.
<?xml version="1.0" encoding="UTF-8"?>
<!-- <xsl:stylesheet> defines an XSLT file. The namespace must be exactly the W3C one shown below. -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- <xsl:template match="/"> starts formatting the entire XML document. -->
<xsl:template match="/">
<html>
<head>
<title>Library Books</title>
</head>
<body>
<h2>Library Books</h2>
<table border="1">
<tr>
<th>Title</th>
<th>Author</th>
<th>Price</th>
</tr>
<xsl:for-each select="library/book">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="author"/></td>
<td><xsl:value-of select="price"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
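Step 3: How to Run the Files
- Save both files (books.xml and books.xsl) in the same folder.
- Open books.xml in your browser (Chrome, Edge, Firefox).
- The book list will be displayed as a table.
- If the table does not appear, the browser may be blocking stylesheets for files opened directly from disk; serving the folder through a local web server usually solves this.
Final Output in Browser
| Title | Author | Price |
| WEB TECHNOLOGY | DR. AJAY KUMAR PATHAK | 800 |
| The Guide | DR. AJAY KUMAR PATHAK | 800 |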
THE END
RULES FOR WELL-FORMED XML : - When a document follows the XML markup syntax rules, it is said to be well-formed. Documents that have incorrect syntax are referred to as malformed.
Well-Formed XML
The primary rules for a well-formed XML document are:
- There may be no whitespace (character spaces or line returns) before the XML declaration, if there is one.
- There is a single "root" element that contains all the other elements. It encloses the entire document and may be used only once in the document.
- An element must have both an opening and closing tag, unless it is an empty element.
- If an element is empty, it must contain a closing slash before the end of the tag (for example, <br/>).
- All opening and closing tags must nest correctly and not overlap.
- There may not be whitespace between the opening < and the element name in a tag.
- All element attribute values must be in straight quotation marks (either single or double quotes).
- An element may not have two attributes with the same name.
- Comments and processing instructions may not appear inside tags.
- No unescaped < or & signs may occur in the character data of an element or attribute.
- The element tags are case-sensitive; the beginning and end tags must match exactly. Tag names cannot contain any of the characters !"#$%&'()*+,/;<=>?@[\]^`{|}~, nor a space character, and cannot start with -, ., or a numeric digit.
COMPLETE EXAMPLE OF A WELL-FORMED XML DOCUMENT:-
<?xml version="1.0" encoding="UTF-8"?>
<school>
<student id="101">
<name>MOHIT SINHA</name>
<age>20</age>
<email>mohit@gmail.com</email>
</student>
<student id="102">
<name>SANDIP NANDI</name>
<age>21</age>
<email>sanip@gmail.com</email>
</student>
</school>
THE END
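Several of the well-formedness rules can be checked automatically, because a parser refuses any malformed document. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper is_well_formed and the sample strings are illustrative and not part of the syllabus.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


GOOD = '<school><student id="101"><name>MOHIT SINHA</name></student></school>'
OVERLAPPING_TAGS = "<school><student><name>MOHIT SINHA</student></name></school>"
TWO_ROOTS = "<school></school><school></school>"
UNQUOTED_ATTRIBUTE = "<school><student id=101></student></school>"
DUPLICATE_ATTRIBUTE = '<school><student id="101" id="102"></student></school>'

for sample in [GOOD, OVERLAPPING_TAGS, TWO_ROOTS, UNQUOTED_ATTRIBUTE, DUPLICATE_ATTRIBUTE]:
    print(is_well_formed(sample), sample)
```

Only the first sample is accepted; each of the others breaks exactly one rule from the list (nesting, single root, quoted attribute values, unique attribute names).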
DISCERNING STRUCTURE : -
In XML, "Discerning Structure" refers to understanding and identifying the hierarchical organization (that is, the tree structure) and components of an XML document. It is about recognizing how the elements are nested, how they relate to one another, and how data is represented in a tree-like format.
Key Concepts in Discerning XML Structure:
(NOTE:- An XML document is always descriptive. The tree structure is often referred to as the XML Tree and plays an important role in describing any XML document easily. The tree structure contains root (parent) elements, child elements and so on. By using the tree structure, you can get to know all succeeding branches and sub-branches starting from the root. Parsing starts at the root, then moves down the first branch to an element, takes the first branch from there, and so on to the leaf nodes.)
1. Root Element:- Every XML document must have one root element that contains all other elements.
Example: <bookstore> <!-- Other elements inside --> </bookstore>
2. Child Elements (Sub-elements):- Elements nested inside another element.
Example: <book> <title> XML Basics </title> <author> DR. AJAY KR PATHAK </author> </book>
3. Attributes:- Provide additional information about elements.
Example:- <book category="programming"> <title>XML Guide</title> </book>
4. Hierarchy (Parent-Child Relationship):-
· Shows how elements are nested within each other.
· The bookstore is the parent of book, and book is the parent of title, author, etc.
5. Well-Formedness:
· Proper nesting and closing of tags.
· Example of well-formed:
<note>
<to>Student</to>
<from>Teacher</from>
</note>
6. Comments and Declarations:-
· XML Declaration: <?xml version="1.0" encoding="UTF-8"?>
· Comments: <!-- This is a comment -->
EXAMPLE: DISCERNING STRUCTURE:-
<?xml version="1.0" encoding="UTF-8"?>
<library>
<book id="b1">
<title>Learn XML</title>
<author> ANSHU SINGH </author>
<year>2023</year>
</book>
<book id="b2">
<title>Advanced XML</title>
<author> DR. AJAY KR PATHAK </author>
<year>2024</year>
</book>
</library>
EXPLANATION :-
· Root Element: library
· Child Elements of library: book (2 of them)
· Attributes: id="b1" and id="b2"
· Sub-elements of book: title, author, year
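The same structure can be discovered by a program that walks the tree from the root down to the leaf nodes. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<library>
  <book id="b1">
    <title>Learn XML</title>
    <author> ANSHU SINGH </author>
    <year>2023</year>
  </book>
  <book id="b2">
    <title>Advanced XML</title>
    <author> DR. AJAY KR PATHAK </author>
    <year>2024</year>
  </book>
</library>"""


def outline(element, depth=0):
    """Return one indented line per element: tag, attributes, and text if any."""
    attrs = "".join(f' {key}="{value}"' for key, value in element.attrib.items())
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + attrs + (f": {text}" if text else "")]
    # Visit the children in document order, one level deeper.
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


lines = outline(ET.fromstring(LIBRARY_XML))
print("\n".join(lines))
```

The indentation of each printed line corresponds to the depth of the element in the tree: library at depth 0, book at depth 1, and title, author, year at depth 2.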
WORKING WITH MIXED CONTENT:-
(Mixed content in an XML file means an element can contain:
- Text
- Child elements)
("Mixed content means combining plain text and XML tags inside the same element.")
Example Program: Let's say we want to write a paragraph where some text is bold and some is italic.
<message>
Hello,
<b>this is bold</b> and <i>this is italic</i> text!
</message>
What does it mean?
- Hello, = plain text
- <b>this is bold</b> = bold text inside child element
- <i>this is italic</i> = italic text inside child element
- text! = again plain text
So, the <message> element contains text and tags together, which is mixed content.
Step-by-Step Explanation:
Step 1: Start with XML Declaration
<?xml version="1.0" encoding="UTF-8"?>
This tells the XML processor about the version and encoding.
Step 2: Define the Root Element
<message> ... </message>
This is the main element wrapping the entire content.
Step 3: Mix Text and Elements
Inside <message>, we put:
- Normal text: Hello,
- Child element: <b>this is bold</b>
- Child element: <i>this is italic</i>
- Normal text: text!
Full XML Code:
<?xml version="1.0" encoding="UTF-8"?>
<message>
Hello,
<b>this is bold</b> and <i>this is italic</i> text!
</message>
Important Rules for Mixed Content:
- Text and elements must be allowed in the element's structure (in DTD or schema).
- Whitespace is considered text too.
- Mixed content is common in document-type XML, not database-type.
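To see how a parser keeps the text and the child elements of mixed content apart, the message example can be inspected piece by piece. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which stores the text before the first child in .text and the text following each child in that child's .tail.

```python
import xml.etree.ElementTree as ET

MESSAGE_XML = "<message>Hello, <b>this is bold</b> and <i>this is italic</i> text!</message>"

message = ET.fromstring(MESSAGE_XML)
bold, italic = message[0], message[1]

# Text before the first child element belongs to the parent.
print(repr(message.text))
# Text inside a child, and the text that follows its closing tag.
print(repr(bold.text), repr(bold.tail))
print(repr(italic.text), repr(italic.tail))

# itertext() walks the text pieces in document order, ignoring the tags.
plain = "".join(message.itertext())
print(plain)
```

The spaces around "and" are kept exactly as written, which illustrates the rule that whitespace counts as text in mixed content.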
Optional: Sample DTD for Mixed Content
If you're using a DTD to validate the XML, define it like this:
<!DOCTYPE message [
<!ELEMENT message (#PCDATA | b | i)*>
<!ELEMENT b (#PCDATA)>
<!ELEMENT i (#PCDATA)>
]>
Here,
- #PCDATA = Parsed Character Data (text)
- | = OR
- * = Zero or more times
So, message can have text, <b>, or <i> in any order.
MORE EXAMPLE WITH DEFINITION OF WORKING WITH MIXED CONTENT:-
What are "Mixed Content" models in XML?
In XML, "Mixed Content" refers to an element content model that allows both text and child elements within an element. This means that the element can contain both character data (i.e. text content) and child elements in any order. For example, consider an XML element named "paragraph" that can contain both text and a child element named "emphasis":
<!ELEMENT paragraph (#PCDATA | emphasis)*>
<!ELEMENT emphasis (#PCDATA)>
In this example, the "paragraph" element is defined as having mixed content by using the asterisk (*) to indicate that any number of occurrences of either character data or the "emphasis" element can appear in any order. The "emphasis" element is defined as containing only character data.
With this content model, the following are all valid examples of "paragraph" elements:
<paragraph>This is a simple paragraph.</paragraph>
<paragraph><emphasis>This</emphasis> is an <emphasis>important</emphasis> paragraph.</paragraph>
<paragraph><emphasis>This</emphasis> paragraph contains <emphasis>multiple</emphasis> emphases.</paragraph>
Mixed content models can be useful when the content of an element needs to contain both text and other elements, such as in the case of HTML-like markup languages or rich-text document formats. However, they can also make it more difficult to validate and process the XML document, as the order and content of child elements can vary.
XML Mixed Content
Mixed content models enable you to include both text and element content within a single content model. To create a mixed content model in XML Schemas, simply include the mixed attribute with the value true in your <complexType> definition, like so:
<element name="description">
<complexType mixed="true">
<choice minOccurs="0" maxOccurs="unbounded">
<element name="em" type="string"/>
<element name="strong" type="string"/>
<element name="br" type="string"/>
</choice>
</complexType>
</element>
The preceding example declares a <description> element, which can contain an unlimited number of <em>, <strong>, and <br> elements. Because the complex type is declared as mixed, text can be interspersed throughout these elements. An allowable <description> element might look like the following:
<description> Joe is a developer </description>
In this <description> element, textual content is interspersed throughout the elements declared within the content model. As the schema validator is processing the preceding example, it skips over the textual content and entities while performing standard validation on the elements. Because the elements <em>, <strong>, and <br> may appear repeatedly (maxOccurs="unbounded"), the example is valid.
END
ADDING COMMENTS IN XML :-
In XML, comments are notes or explanations written within the code that help humans understand the document but are ignored by the XML parser (i.e., the software reading the XML).
In XML, as in every programming language, comments play an important role because they can be used to explain a particular piece of code. An XML comment can be a single word, a line or a whole paragraph that documents the file and helps you understand the tags used in it. Keep in mind that inline comments should be used sparingly, as too many of them make the code harder to read. Comments are possible almost anywhere in the XML code, except before the XML declaration and inside a tag. Unlike JSON, XML supports comments out of the box. In XML, a comment starts with "<!--" and ends with "-->". You can add a text note between these markers.
Syntax of XML Comments
<example><!-- This is a comment --></example>
They do not
affect the structure or the data and are often used for:
- Documentation
- Descriptions
- Reminders
- Temporary disabling of code/data
| Type | Syntax | Typical use |
| Single-Line Comment | <!-- This is a comment --> | Simple descriptions |
| Multi-Line Comment | <!-- Comment across multiple lines --> | Detailed notes/documentation |
| Commented XML Block | <!-- <tag>data</tag> --> | Temporarily disabling code/data |
Rules of Writing XML Comments:-
| 1 | A comment must start with <!-- and end with -->. |
| 2 | Comments cannot be nested (no comments inside comments). |
| 3 | The string -- is not allowed inside comments. |
| 4 | Comments can be placed anywhere in the XML document, except before the XML declaration and inside a tag. |
| 5 | Comments are not displayed or processed by the parser. |
Types of Comment Lines in XML
Technically, XML supports only one type of comment syntax, but based on how and where comments are used, they can be grouped as:
1. Single-Line Comments
2. Multi-Line Comments
EXAMPLE OF THE COMMENTS
<?xml version="1.0" encoding="UTF-8"?>
<!-- This is the main XML document for student data -->
<students>
<!-- Student 1 Details -->
<student>
<id>101</id>
<name>Ravi Kumar</name>
<course>BCA</course>
</student>
<!-- Student 2 Details -->
<student>
<id>102</id>
<name>Priya Sharma</name>
<course>MCA</course>
</student>
<!-- The following student is on hold,
will be activated later
<student>
<id>103</id>
<name>Rahul Mehta</name>
<course>B.Tech</course>
</student>
-->
</students>
END
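That comments do not affect the data can be confirmed with a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, whose default parser leaves comments out of the resulting tree; the sample is a shortened form of the student document.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<students>
  <!-- Student 1 Details -->
  <student><id>101</id><name>Ravi Kumar</name></student>
  <!-- Student 2 Details -->
  <student><id>102</id><name>Priya Sharma</name></student>
  <!-- The following student is on hold, will be activated later
  <student><id>103</id><name>Rahul Mehta</name></student>
  -->
</students>"""

students = ET.fromstring(STUDENTS_XML)

# Only real elements are counted; the commented-out student is invisible.
names = [student.findtext("name") for student in students.findall("student")]
print(names)
```

The third student is wrapped in a comment, so the program sees two students only. Removing the comment markers would make the record active again.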
WHAT IS XML PARSER:- (In simple words: An XML Parser is like a reader or checker that goes through your XML file to:
- Read the content
- Check if it's correct (follows XML rules)
- Understand the structure and data
Think of it like a teacher checking your exam paper:- If everything follows the rules → then All good! If there are mistakes like missing tags or wrong characters → then Error!
Example (correct XML):-
<person>
<name>AJAY</name>
<age>40</age>
</person>
Incorrect XML (Parser will show error, because <name> is closed with </grade>):
<student>
<name>AJAY</grade>
</student>
)
Full explanation of the parser: - An XML parser is a software library or a package that provides an interface for client applications to work with XML documents. It checks for the proper format of the XML document and may also validate the XML documents. Modern day browsers have built-in XML parsers.
(Diagram: how an XML parser interacts with an XML document.)
To ease the process of parsing, some commercial products are available that facilitate the breakdown of XML documents and yield more reliable results.
Some commonly used parsers are listed below :-
- MSXML (Microsoft Core XML Services) − This is a standard set of XML tools from Microsoft that includes a parser.
- System.Xml.XmlDocument − This class is part of the .NET library, which contains a number of different classes related to working with XML.
- Java built-in parser − The Java library has its own parser. The library is designed such that you can replace the built-in parser with an external implementation such as Xerces from Apache or Saxon.
- Saxon − Saxon offers tools for parsing, transforming, and querying XML.
- Xerces − Xerces is implemented in Java and is developed by the open source Apache Software Foundation.
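Python also ships with a parser, and it can be used to reproduce the "teacher checking the exam paper" idea: either the document is accepted, or the parser reports what went wrong and where. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name check is hypothetical.

```python
import xml.etree.ElementTree as ET


def check(xml_text: str) -> str:
    """Parse the text and describe the outcome in one line."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        # error.position is a (line, column) pair supplied by the parser.
        line, _column = error.position
        return f"Error on line {line}: {error}"
    return f"OK, root element is <{root.tag}>"


GOOD = "<person>\n<name>AJAY</name>\n<age>40</age>\n</person>"
BAD = "<student>\n<name>AJAY</grade>\n</student>"

good_report = check(GOOD)
bad_report = check(BAD)
print(good_report)
print(bad_report)
```

The second report points at line 2, where <name> is closed with </grade>.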
CDATA SECTIONS:- (Full form of CDATA: Character Data. It is also called unparsed character data.) CDATA is defined as blocks of text that are not parsed (read or recognized) by the parser, but would otherwise be recognized as markup.
The predefined entities such as &lt; (less than, <), &gt; (greater than, >) and &amp; (ampersand, &) require extra typing and are generally difficult to read in the markup. (The same entities are used in HTML, where they represent reserved characters.) In such cases, a CDATA section can be used. By using a CDATA section, you are telling the parser that the particular section of the document contains no markup and should be treated as regular text.
WHY DO WE NEED CDATA? XML gives certain characters a special meaning:
1.. < (less than)
2.. > (greater than)
3.. & (ampersand)
If we include these characters directly in XML content (especially < and &), it will cause errors or confusion for the XML parser. So, we wrap such content in <![CDATA[ ... ]]> to safely include any text.
CDATA Syntax:
<![CDATA[
Your text here...
]]>
Everything inside <![CDATA[ and ]]> will be treated as text, not XML code.
EXAMPLE:-
Let's take a simple XML example without CDATA and see the problem.
Example Without CDATA (Problem):
<note>
<message>Use 5 < 10 in your code</message>
</note>
This will cause an error! Because < is interpreted as the start of a new tag.
CORRECT EXAMPLE USING CDATA:
<note>
<message><![CDATA[Use 5 < 10 in your code]]></message>
</note>
Explanation:
- <![CDATA[ starts the CDATA section
- Use 5 < 10 in your code is now just plain text
- ]]> ends the CDATA section
Now, the XML parser will NOT try to interpret < 10 as a tag.
ANOTHER REAL-LIFE EXAMPLE
<code>
<![CDATA[
if (x < 5 && y > 10) {
console.log("Hello, world!");
}
]]>
</code>
Here:
- <, >, &&, and other symbols will NOT break your XML
- It's a safe way to store code or scripts
Key Points to Remember:
| Feature | Description |
| Start | <![CDATA[ |
| End | ]]> |
| Safe for | Symbols like <, >, &, etc. |
| Used for | Including raw text, code, special characters |
| Parser Behavior | Does not interpret tags or special characters inside CDATA |
Final XML CDATA Example:
<book>
<title>Learning XML</title>
<description><![CDATA[
This book explains XML basics.
It also includes code like <tag> and &entity; safely.
]]></description>
</book>
Here, the description contains:
- <tag> (won't be confused as an XML tag)
- &entity; (won't be treated as a special entity)
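The effect of a CDATA section can be compared with the two alternatives, an escaped entity and an unescaped character. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the constant names are illustrative.

```python
import xml.etree.ElementTree as ET

WITH_CDATA = "<note><message><![CDATA[Use 5 < 10 in your code]]></message></note>"
WITH_ENTITY = "<note><message>Use 5 &lt; 10 in your code</message></note>"
UNESCAPED = "<note><message>Use 5 < 10 in your code</message></note>"

# Both safe forms give the application exactly the same text.
cdata_text = ET.fromstring(WITH_CDATA).findtext("message")
entity_text = ET.fromstring(WITH_ENTITY).findtext("message")
print(cdata_text)
print(entity_text)

# The unescaped form is rejected by the parser.
try:
    ET.fromstring(UNESCAPED)
    unescaped_accepted = True
except ET.ParseError as error:
    unescaped_accepted = False
    print("Parser error:", error)
```

CDATA and entities are therefore two spellings of the same content; CDATA is simply more convenient when a block contains many special characters.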
END
PCDATA IN XML :- Meaning: It is the text that XML can read, understand, and process. In XML, when you write normal text inside a tag, that text is called PCDATA. This data is read and checked (parsed) by the XML parser. (In simple words: the plain data that we write in an XML program is called PCDATA.)
Simple Example: <name>AJAY KUMAR</name> Here, AJAY KUMAR is PCDATA. It is inside the <name> tag, and XML will read and understand it as the value for the name.
Special characters must be written carefully
Some characters have special meanings in XML, so you can't use them directly in PCDATA.
| Symbol | Use in XML PCDATA |
| < | &lt; |
| > | &gt; |
| & | &amp; |
Example: <note>5 &lt; 10</note> (This means: 5 < 10. XML will understand it correctly because < is written as &lt;)
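Replacing the special characters by hand is error-prone, so programs usually let a library do it. This is a minimal sketch, assuming Python's standard xml.sax.saxutils and xml.etree.ElementTree modules; the sample text is made up for illustration.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw = "5 < 10 & 20 > 10"

# escape() rewrites &, < and > as &amp;, &lt; and &gt;.
escaped = escape(raw)
print(escaped)

# The escaped text can be placed inside an element as PCDATA.
note = ET.fromstring(f"<note>{escaped}</note>")

# After parsing, the application gets the original characters back.
print(note.text)
print(unescape(escaped) == raw)
```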
EXAMPLE OF PROGRAM OF XML USING PCDATA (Student Information)
<?xml version="1.0" encoding="UTF-8"?>
<student>
<name>AJAY KUMAR </name>
<roll>102</roll>
<course>Ph. D</course>
<college>MRS. KMPM VC College</college>
</student>
Parsed Character Data (PCDATA) is a data definition that originated in Standard Generalized Markup Language (SGML), and is used also in Extensible Markup Language (XML) Document Type Definition (DTD) to designate mixed content XML elements.
WHAT IS DTD (CREATING A DTD ):-
DTD :- (In simple words: .XML, .DTD and .XSD (XML Schema Definition) files
only carry data / information; they do not display it in a browser. When data
is transmitted to another document or application, the DTD is used to check
that the data is in a legal, properly structured form. If unchecked data were
sent to another application and started giving errors there, it would create
problems, so the data is checked and verified against the DTD first.)
DOCUMENT TYPE DEFINITIONS (DTD) :- The XML Document Type Definition, commonly
known as DTD, is a way to describe an XML language precisely (an application can
use a DTD to verify that well-formed XML data is also valid). DTDs check the
vocabulary (the element and attribute names that may be used) and the validity
of the structure of XML documents against the grammatical rules of the
appropriate XML language.
An XML DTD can be either
specified inside the document, or it can be kept in a separate document and
then linked separately.
A document type definition (DTD)
provides you with the means to validate XML files against a set of rules. When
you create a DTD file, you can specify rules that control the structure of any
XML files that reference the DTD file.
A DTD can contain declarations
that define elements, attributes, notations, and entities for any XML files
that reference the DTD file. It also establishes constraints for how each
element, attribute, notation, and entity can be used within any of the XML
files that reference the DTD file.
To be considered a valid XML
file, the document must be accompanied by (go along with) a DTD (or an XML
schema), and conform to all of the declarations in the DTD (or XML schema).
Certain XML parsers have the
ability to read DTDs and check whether the XML file they are reading follows all
of those rules. While the parser is reading the XML file, it checks each
element to be sure that it conforms to the rules that are laid out (put,
arranged) in the DTD file. If there is a problem, the parser generates an error
and points to where the error occurs in the XML file. This kind of parser is
called a validating parser because it validates the content of the XML file
against the DTD.
Syntax
Basic syntax of a DTD is as follows −
<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
IN THE ABOVE SYNTAX,
The DTD starts with <!DOCTYPE delimiter.
An element tells the parser to
parse the document from the specified root element.
DTD identifier is an identifier
for the document type definition, which may be the path to a file on the system
or the URL of a file on the internet. If the DTD points to an external path, it
is called the External Subset.
The square brackets [ ] enclose
an optional list of declarations called the Internal Subset.
A complete example of a well-formed
and valid XML document. It follows all the rules of the DTD.
Save file :- employee.xml
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<!-- EXTERNAL DTD: EMPLOYEE.DTD IS THE FILE NAME -->
<employee>
<firstname>AJAY</firstname>
<lastname>PATHAK</lastname>
<email>ajay@example.com</email>
</employee>
In the above example, the DOCTYPE declaration refers to an
external DTD file. The content of the file is shown below.
File name :- employee.dtd
<!ELEMENT employee (firstname,lastname,email)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT email (#PCDATA)>
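The parts of the DOCTYPE line can also be read by a program. The sketch below uses Python's built-in xml.parsers.expat module; the handler names on_doctype and on_start_element are made up for illustration. Note the assumption here: this parser only checks well-formedness. With the default settings shown it does not open employee.dtd and does not validate, so a separate validating parser would be needed for a real validity check.

```python
import xml.parsers.expat

EMPLOYEE_XML = """<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<firstname>AJAY</firstname>
<lastname>PATHAK</lastname>
<email>ajay@example.com</email>
</employee>"""

found = {}

def on_doctype(name, system_id, public_id, has_internal_subset):
    # Called once, when the parser reaches the DOCTYPE declaration.
    found["doctype_name"] = name
    found["system_id"] = system_id

def on_start_element(name, attributes):
    # Remember only the first element, which is the root element.
    found.setdefault("root", name)

parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.StartElementHandler = on_start_element
parser.Parse(EMPLOYEE_XML, True)

print(found)
```

The name after <!DOCTYPE and the root element are both employee, and the system identifier is the DTD file name.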
DATA TYPES IN XML
There are 2 data types,
(1) PCDATA (full form: parsed character data) is character data that is parsed.
(2) CDATA (full form: character data) is character data that is not usually parsed.
(1).. PCDATA TYPE :- PCDATA (Parsed Character Data) is text that will be parsed
by the XML parser.
Tags inside the PCDATA will be
treated as markup and entities will be expanded.
PCDATA refers to the
character data within an XML element that will be parsed by the XML processor.
PCDATA can contain text, but certain characters like '<', '>', and
'&' must be escaped using predefined entities (&lt;, &gt;,
&amp;) to ensure they are treated as data and not markup.
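The statement that tags inside PCDATA are treated as markup and entities are expanded can be seen in a small Python sketch with the built-in ElementTree parser. The one-line document is hypothetical, made up only for this demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line document: the text contains a real tag and an entity.
paragraph = ET.fromstring("<p>Tom &amp; Jerry are <b>friends</b></p>")

# Text up to the first child element, with &amp; already expanded to &.
text_before_tag = paragraph.text
# <b> was parsed as markup, so it shows up as a child element, not as text.
child_tags = [child.tag for child in paragraph]
child_text = paragraph[0].text

print(repr(text_before_tag))
print(child_tags, child_text)
```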
Example demonstrating the use of PCDATA in an XML document:
SAVE AS FILE:- *.XML
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="WEB TECH">
<title>PROGRAMMING</title>
<author>AJAY KUMAR PATHAK </author>
<year>2023</year>
<description>
This is a PCDATA example. It contains text with special characters like
&lt;, &gt;, and &amp;,
which must be escaped in XML.
</description>
</book>
</bookstore>
In this example:
<title>, <author>,
<year>, and <description> are elements in the XML.
The text within the
<title>, <author>, and <year> elements represents PCDATA.
The <description> element
contains PCDATA that includes special characters like <, >, and &,
which are escaped as entities (&lt;, &gt;, &amp;) to ensure proper
parsing by an XML processor.
PCDATA allows you to include
textual data within XML elements while ensuring that special characters are
properly represented and parsed as data rather than XML markup.
EXAMPLE OF PCDATA TYPE PROGRAM WITH .DTD FOR PRACTICAL ALSO
XML FILE (example.xml):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE library SYSTEM "example.dtd">
<library>
<book>
<title>XML Basics</title>
<author>AJAY PATHAK</author>
<published_year>2023</published_year>
<description>This is a book about XML
basics.</description>
</book>
<book>
<title>Learning XPath</title>
<author>PATHAK AJAY</author>
<published_year>2022</published_year>
<description>An introduction to XPath for XML
querying.</description>
</book>
</library>
DTD File (example.dtd):
<!ELEMENT library (book+)>
<!ELEMENT book (title, author, published_year, description)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT published_year (#PCDATA)>
<!ELEMENT description (#PCDATA)>
Explanation:
In the XML file, there is a <library> element
containing multiple <book> elements.
Each <book> element consists of child elements like
<title>, <author>, <published_year>, and <description>.
The DTD file defines the
structure of the XML document:
<!ELEMENT library (book+)>
specifies that the library element must contain one or more book elements.
<!ELEMENT book (title, author,
published_year, description)> defines the structure of the book element
containing title, author, published_year, and description elements in that
order.
<!ELEMENT title (#PCDATA)>,
<!ELEMENT author (#PCDATA)>, <!ELEMENT published_year (#PCDATA)>,
<!ELEMENT description (#PCDATA)> declare that the child elements contain
PCDATA (Parsed Character Data), meaning they can contain text content but no
child elements.
This structure outlines the
hierarchy and content model for the XML document using the DTD, specifying
which elements hold other elements and which hold only text data.
(2). CDATA TYPE: -
CDATA stands for Character Data and is a way to include
blocks of text in XML documents without the need for escaping special
characters (such as <, >, &, etc.). CDATA sections begin with
<![CDATA[ and end with ]]> and are used to encapsulate blocks of text that
might contain characters that would otherwise be recognized as markup.
Here's an example demonstrating the use of CDATA in an XML
document:
SAVE AS FILE:- *.XML
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="WEB TECH">
<title>PROGRAMMING</title>
<author>AJAY KUMAR PATHAK</author>
<year>2023</year>
<description><![CDATA[
The web tech is
Markup language.
This CDATA
section allows including text without worrying about escaping special
characters,
such as <, >, or &.
]]></description>
</book>
</bookstore>
In this example:
<title>, <author>, <year>, and
<description> are elements in the XML.
The characters <, >, and & inside the CDATA section of <description>
are treated as regular text and
do not need to be escaped.
CDATA sections are useful when
you want to include large blocks of text within an XML element without having
to escape special characters. This makes it easier to include content that
contains a lot of reserved or problematic characters without worrying about
XML parsing or validation issues.
EXAMPLE OF CDATA TYPE PROGRAM WITH .DTD FOR PRACTICAL ALSO
CDATA sections begin with <![CDATA[ and end with ]]>.
XML File (example.xml):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE notes SYSTEM "example.dtd">
<notes>
<note>
<title><![CDATA[Important Note]]></title>
<content><![CDATA[
This is a
sample note with special characters like < and >.
It doesn't affect the XML parsing
because it's wrapped in a CDATA section.
Special
characters like & and ' also don't need escaping here.
]]></content>
</note>
</notes>
DTD File (example.dtd):
<!ELEMENT notes (note*)>
<!ELEMENT note (title, content)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT content (#PCDATA)>
In this example, the <![CDATA[ and ]]> markers wrap the text inside the
<title> and <content> elements. The DTD (example.dtd) defines the
structure of the XML document, specifying that title and content elements
contain parsed character data (#PCDATA), which can include text or CDATA
sections.
Explanation of <!ELEMENT notes (note*)>:
- <!ELEMENT ...>: This declares an element in DTD.
- notes: This is the name of the element being defined.
- (note*): This specifies the content model – what kind of child elements the <notes> element can contain.
MEANING:
- The notes element can contain zero or more note elements.
- The asterisk * means "zero or more occurrences".
EXAMPLE:
Here’s how the (note*) rule looks in an XML document (the child elements
of <note> used here are only for illustration):
<notes>
<note>
<to>Alice</to>
<from>Bob</from>
<message>Hello!</message>
</note>
<note>
<to>Charlie</to>
<from>David</from>
<message>Hi there!</message>
</note>
</notes>
This satisfies the (note*) rule because:
- <notes> contains two <note> elements.
- Even if there were no <note> elements
inside <notes>, it would still satisfy the rule.
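The meaning of the occurrence symbols can be written down as a small Python sketch. The function count_is_allowed is a hypothetical helper written only for these notes, not part of any XML library. It covers * and +, which appear in the examples above, and also ? (zero or one), which is the third occurrence symbol a DTD allows.

```python
import xml.etree.ElementTree as ET

def count_is_allowed(indicator, count):
    """Check a child count against a DTD occurrence indicator."""
    if indicator == "*":      # zero or more
        return count >= 0
    if indicator == "+":      # one or more
        return count >= 1
    if indicator == "?":      # zero or one
        return count in (0, 1)
    return count == 1         # no indicator: exactly one

# Count the <note> children in two small documents.
notes = ET.fromstring("<notes><note/><note/></notes>")
note_count = len(notes.findall("note"))

empty_notes = ET.fromstring("<notes/>")
empty_count = len(empty_notes.findall("note"))

print(count_is_allowed("*", note_count))    # rule (note*), two notes
print(count_is_allowed("*", empty_count))   # rule (note*), no notes
print(count_is_allowed("+", empty_count))   # rule (note+), no notes
```

An empty <notes> element passes (note*) but would fail (note+), which is the practical difference between the two symbols.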
CREATING A DTD FOR AN EXISTING XML FILE.
TYPE OF DTD :
1. Internal DTD
2. External DTD
1. Internal DTD:- A DTD is referred to as an internal DTD
if the elements are declared within the XML file itself; the file is saved as
file_name.xml. When the document uses only an internal DTD, the standalone
attribute in the XML declaration can be set to yes. This means the declaration
works independently of any external source.
Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element [element-declarations]>
where root-element is the name of root element and
element-declarations is where you declare the elements.
Example
Simple example of internal DTD (Save file_name.xml):-
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!-- STANDALONE = "YES" MEANS THIS XML FILE IS COMPLETE BY ITSELF, AND DOESN’T
NEED ANY EXTERNAL FILE LIKE A DTD (DOCUMENT TYPE DEFINITION). -->
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)> <!-- PCDATA means parsed character data -->
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>ajay pathak</name>
<company>MRSKMPM VC </company>
<phone>1121133</phone>
</address>
LET US GO THROUGH THE ABOVE CODE −
Start Declaration − Begin the XML declaration with the
following statement.
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
DTD − Immediately after the XML header, the document type
declaration follows, commonly referred to as the DOCTYPE −
<!DOCTYPE address [
The DOCTYPE declaration has an exclamation mark (!) at the
start of the element name. The DOCTYPE informs the parser that a DTD is
associated with this XML document.
DTD Body − The DOCTYPE declaration is followed by body of
the DTD, where you declare elements, attributes, entities, and notations.
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Several elements are declared here that make up the
vocabulary (terminology) of the <address> document. <!ELEMENT name
(#PCDATA)> defines the element name to be of type "#PCDATA". Here
#PCDATA means parse-able text data.
End Declaration − Finally, the declaration section of the
DTD is closed using a closing bracket and a closing angle bracket (]>). This
effectively ends the definition, and thereafter, the XML document follows
immediately.
VVI : RULES FOR INTERNAL DTD
1. THE DOCUMENT TYPE DECLARATION MUST APPEAR AT THE START OF THE DOCUMENT (PRECEDED ONLY BY THE XML HEADER) − IT IS NOT PERMITTED ANYWHERE ELSE WITHIN THE DOCUMENT.
2. SIMILAR TO THE DOCTYPE DECLARATION, THE ELEMENT DECLARATIONS MUST START WITH AN EXCLAMATION MARK.
3. THE NAME IN THE DOCUMENT TYPE DECLARATION MUST MATCH THE ELEMENT TYPE OF THE ROOT ELEMENT.
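To get a feeling for what a validating parser does with an internal DTD, here is a deliberately simplified Python sketch. The helper functions declared_children and matches_declaration are hypothetical and written only for these notes. They understand only a plain comma sequence such as (name,company,phone) and are not a replacement for a real validating parser. Python's built-in ElementTree reads the document but does not validate it, so the comparison is done by hand.

```python
import re
import xml.etree.ElementTree as ET

ADDRESS_XML = """<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>ajay pathak</name>
<company>MRSKMPM VC </company>
<phone>1121133</phone>
</address>"""

def declared_children(xml_text, element_name):
    """Return the child names from a plain comma sequence such as (a,b,c)."""
    pattern = r"<!ELEMENT\s+" + re.escape(element_name) + r"\s+\(([^)]*)\)>"
    match = re.search(pattern, xml_text)
    return [name.strip() for name in match.group(1).split(",")]

def matches_declaration(xml_text):
    """Compare the root's actual children with its declared sequence."""
    root = ET.fromstring(xml_text)
    actual = [child.tag for child in root]
    return actual == declared_children(xml_text, root.tag)

print(declared_children(ADDRESS_XML, "address"))
print(matches_declaration(ADDRESS_XML))

# Remove the <name> element: still well-formed, but no longer as declared.
without_name = ADDRESS_XML.replace("<name>ajay pathak</name>\n", "")
print(matches_declaration(without_name))
```

The second document is still well-formed, but it no longer follows the declared sequence, which is exactly the case a validating parser reports as an error.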
EXAMPLE : (1) PROGRAM OF INTERNAL DTD FOR PRACTICAL
Save file_name .xml
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)> <!-- PCDATA means parsed character data -->
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>ajay pathak</name>
<company>MRSKMPM VC </company>
<phone>1121133</phone>
</address>
EXAMPLE : (2) PROGRAM OF INTERNAL DTD FOR PRACTICAL
Save file_name .xml
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>AJAY</to>
<from>PAWAN, RAJU SINGH </from>
<heading>MEETING </heading>
<body>Meeting on Next Sunday Morning at 12
Noon.</body>
</note>
EXAMPLE : (3) PROGRAM OF INTERNAL DTD FOR PRACTICAL
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
<!ELEMENT note (to, from, message)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT message (#PCDATA)>
]>
<note>
<to>Ravi</to>
<from>Seema</from>
<message>Hello! How are
you?</message>
</note>
2. External DTD : SAVE AS FILE_NAME.DTD
In an external DTD, elements are declared outside the XML file.
They are accessed by specifying a system identifier, which may be either the
name of a .dtd file or a valid URL. To refer to it as an external DTD, the
standalone attribute in the XML declaration must be set to no (or left out,
since no is the default in this case).
This means the declaration includes information from the
external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
WHERE FILE-NAME IS THE FILE WITH . DTD EXTENSION.
Example
The following example shows external DTD usage −
<!DOCTYPE address SYSTEM "address.dtd">
<!-- HERE WE CALL THE EXTERNAL .DTD FILE ("address.dtd"), WHICH IS CREATED AS
A SEPARATE FILE -->
<address>
<name>AJA PATHAK</name>
<company>MRSKMPM VC </company>
<phone>232312312</phone>
</address>
The content of the DTD file address.dtd is as shown −
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Explanation Types
You can refer to an external DTD by using either system
identifiers or public identifiers.
System Identifiers
A system identifier enables you to specify the location of
an external file containing DTD declarations. Syntax is as follows −
<!DOCTYPE name SYSTEM "address.dtd" [...]>
As you can see, it contains keyword SYSTEM and a URI
reference pointing to the location of the document.
EXAMPLE : (1) PROGRAM OF EXTERNAL DTD FOR PRACTICAL
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>AJAY KUMAR PATHAK</to>
<from>PAWAN, RAJ SINGH</from>
<heading>MEETING </heading>
<body>Meeting on NEXT SUNDAY Morning at 12
NOON.</body>
</note>
SAVE AS FILE_NAME .DTD (note.dtd)
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
THE END
THE CONCEPT OF A VALID XML DOCUMENT:-
Most XML browsers will check your document to see if
it is well formed. Some of them can also check whether it's valid. An XML
document is valid if there is a document type definition (DTD) or XML schema
associated with it, and if the document complies with that DTD or schema.
An XML document is called valid when it is
well-formed AND it follows the rules
defined in a DTD (Document Type Definition) or XML Schema (XSD). So,
"valid" means the XML file is correct in structure and it follows a
defined set of rules.
Well-formed means:
1. Tags are properly opened and closed.
2. Tags are nested correctly.
3. There is one root element.
4. Attribute values are in quotes.
Example (Well-formed XML):
<student>
<name>Ravi</name>
<age>20</age>
</student>
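The well-formedness rules can be tested with any XML parser, because a parser refuses a document that breaks them. Below is a small Python sketch using the built-in ElementTree module. The function name check_well_formed and the broken sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return (True, None), or (False, (line, column)) for the first error."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as error:
        return False, error.position

GOOD = "<student>\n<name>Ravi</name>\n<age>20</age>\n</student>"
# Rule 2 broken: <student> is closed while <name> is still open.
BAD_NESTING = "<student>\n<name>Ravi</student>\n</name>"
# Rule 3 broken: two root elements.
TWO_ROOTS = "<student/>\n<student/>"
# Rule 4 broken: attribute value without quotes.
NO_QUOTES = "<student id=1/>"

for sample in (GOOD, BAD_NESTING, TWO_ROOTS, NO_QUOTES):
    print(check_well_formed(sample))
```

This checks only well-formedness. Checking validity additionally needs a DTD or schema and a validating parser.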
Valid XML (with DTD) Example:
Valid XML with Internal DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ELEMENT student (name, age)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
]>
<student>
<name>Ravi</name>
<age>20</age>
</student>
(FOR MORE ON WELL-FORMED XML, PLEASE SEE THE PREVIOUS NOTES ABOVE.)
DTD – COMPONENTS :-
A DTD will basically contain
declarations of the following XML components
(1)..Element
: - XML elements can be defined as building blocks of an XML document.
Elements can behave as a container to hold text, elements, attributes, media
objects or mix of all.
A DTD element is declared with an
ELEMENT declaration. When an XML file is validated by DTD, parser initially
checks for the root element and then the child elements are validated.
Example
A simple example of XML elements
<name>
AJAY PATHAK
</name>
As we can see, we
have defined a <name> tag. There is text between the start and end tags of
<name>.
(2)..Attributes : -
Attributes are part of the XML elements. An element can have
any number of unique attributes. Attributes give more information about the XML
element or, more precisely (exactly), they
define a property of the element. An XML attribute is always a name-value
pair.
A simple example of XML attributes
<img src="flower.jpg"/>
Here img is the element name whereas src is an attribute
name and flower.jpg is a value given for the attribute src.
(a) Internal entities :- If an entity is declared within a
DTD it is called an internal entity.
Syntax
Following is the syntax for internal entity declaration:-
<!ENTITY entity_name "entity_value">
In the above syntax
entity_name is the name of entity followed by its value
within the double quotes or single quote.
entity_value holds the value for the entity name.
Example of internal entities :-
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes"?>
<!DOCTYPE address [
<!ELEMENT address (#PCDATA)>
<!ENTITY name "RAJ SINGH">
<!ENTITY company "RAJSINGH.COM">
<!ENTITY phone_no "12345678">
]>
<address>
&name;
&company;
&phone_no;
</address>
In the above example, the respective entity names name,
company and phone_no are replaced by their values in the XML document. The
entity values are de-referenced by writing & before the entity name and ;
after it.
NOTE:- SAVE THIS FILE AS SAMPLE.XML
AND OPEN IT IN ANY BROWSER. YOU WILL NOTICE THAT THE ENTITY VALUES FOR NAME,
COMPANY, PHONE_NO ARE REPLACED RESPECTIVELY.
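Instead of a browser, the same replacement can be observed with a short Python sketch using the built-in ElementTree module, which expands entities declared in the internal DTD. The XML declaration line is left out only to keep the string short, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """<!DOCTYPE address [
<!ELEMENT address (#PCDATA)>
<!ENTITY name "RAJ SINGH">
<!ENTITY company "RAJSINGH.COM">
<!ENTITY phone_no "12345678">
]>
<address>
&name;
&company;
&phone_no;
</address>"""

address = ET.fromstring(SAMPLE_XML)

# Each entity reference has been replaced by its declared value.
values = [line.strip() for line in address.text.splitlines() if line.strip()]
print(values)
```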
(b) External entities :- If the content of an entity is kept
outside the DTD, in a separate file, it is called an external entity. You can
refer to an external entity by
either using system identifiers or public identifiers.
Syntax
Following is the syntax for External Entity declaration −
<!ENTITY name SYSTEM "URI/URL">
In the above syntax :−
name is the name of entity.
URI/URL is the address of the external source enclosed
within the double or single quotes.
You can refer to an external DTD by either using −
(1).. System Identifiers − A system identifier enables you to specify the location of an external file containing DTD declarations. Syntax is as follows −
<!DOCTYPE name SYSTEM "address.dtd" [...]>
As you can see, it contains the keyword SYSTEM and a URI
reference pointing to the document's location.
(2).. Public Identifiers −
Public identifiers provide a mechanism to locate DTD resources and are written
in the following general form −
<!DOCTYPE name PUBLIC "public-identifier" "system-identifier">
As you can see, it begins with the keyword PUBLIC, followed by a
specialized identifier. Public identifiers are used to identify an entry in a
catalog. Public identifiers can follow any format; however, a commonly used
format is called Formal Public Identifiers, or FPIs.
Example
Let us understand the external entity with the following
example −
<?xml version = "1.0" encoding = "UTF-8" standalone = "no"?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>
RAJ SINGH
</name>
<company>
RAJSINGH.COM
</company>
<phone>
12345678
</phone>
</address>
BELOW IS THE CONTENT OF THE DTD FILE ADDRESS.DTD
<!ELEMENT address (name, company, phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
THE END
TYPES OF ENTITIES :-
(1).. Built-in entities (also called entity references or predefined entities)
(2).. Character entities
(3).. General entities
(4).. Parameter entities
In general, you can use the built-in
entity references anywhere you can use normal text within the XML
document, such as in element contents and attribute values.
There are five built-in entities that play their role in well-formed
XML, they are :-
(1). &lt; for < (less than)
(2). &gt; for > (greater than)
(3). &amp; for & (ampersand)
(4). &quot; for " (double quote)
(5). &apos; for ' (single quote or apostrophe)
SAVE AS FILE_NAME .XML
<?xml version="1.0" encoding="UTF-8"?>
<document>
<title>Example of Predefined Entity References</title>
<content>
<paragraph>
<!-- WRONG (NOT WELL-FORMED): <example1> xml & html </example1> -->
<example1> xml &amp; html </example1> <!-- THIS IS CORRECT -->
<!-- WRONG (NOT WELL-FORMED): <example2> 10 < 20 </example2> -->
<example2> 10 &lt; 20 </example2> <!-- THIS IS CORRECT -->
<!-- NOT RECOMMENDED: <example3>20 >10 </example3> -->
<example3>20 &gt; 10</example3> <!-- THIS IS CORRECT -->
<!-- NOT RECOMMENDED: <example4>xml " html</example4> -->
<example4>xml &quot; html</example4> <!-- THIS IS CORRECT -->
<!-- NOT RECOMMENDED: <example5>xml ' html</example5> -->
<example5>xml &apos; html</example5> <!-- THIS IS CORRECT -->
<!-- NOTE: IF WE WANT TO WRITE < (LESS THAN) OR & IN AN XML FILE, WE CANNOT
WRITE IT DIRECTLY. THEREFORE, IN PLACE OF <, > AND & WE WRITE
&lt;, &gt;, &amp; ETC. -->
<p>This is an example of using predefined entity references in XML.</p>
</paragraph>
<paragraph>
<p>XML supports special characters like
&lt;, &gt;, &amp;, &quot;, and
&apos;.</p>
</paragraph>
<special_characters>
&lt; &gt; &amp; &quot; &apos;
</special_characters>
</content>
</document>
IN ABOVE EXAMPLE:
The <paragraph> elements showcase text content using
&lt;, &gt;, &amp;, &quot;, and &apos; to display the
reserved characters <, >, &, ", and '.
The <special_characters> element displays the
reserved characters using their respective entity references.
When this XML file is parsed, the XML processor will
interpret the entity references and display the reserved characters as intended
without causing syntax errors.
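The five predefined entity references work in both directions, which a short Python sketch with the built-in ElementTree module can show: the parser decodes them when reading, and the serializer writes the escapes for us when saving text. The element and variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Reading: all five predefined entity references inside one element.
element = ET.fromstring(
    "<special_characters>&lt; &gt; &amp; &quot; &apos;</special_characters>"
)
decoded = element.text

# Writing: the serializer escapes the characters that need it.
example = ET.Element("example1")
example.text = "xml & html, 10 < 20"
encoded = ET.tostring(example, encoding="unicode")

print(decoded)
print(encoded)
```

When XML is produced by a program in this way, the escaping is done automatically. The manual rules matter mainly when the file is typed by hand.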
(2).. Character entities : -
Character entities are used to name characters that are
difficult or impossible to type. Such a character is written as a numeric
character reference (for example &#169;), and an entity name can then stand
for it.
Following example demonstrates the character entity
declaration −
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes"?>
<!DOCTYPE author[
<!ELEMENT author (#PCDATA)>
<!ENTITY writer "AJAY PATHAK">
<!ENTITY copyright "&#169;">
]>
<author>&writer;&copyright;</author>
You will notice here we have used
&#169; as the value for the copyright character. Save this file as sample.xml
and open it in your browser and you will see that copyright is replaced by the
character ©.
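The same result can be checked without a browser. The Python sketch below assumes that the entity value is the numeric character reference &#169; (decimal code 169, the copyright sign) and uses the built-in ElementTree parser. The XML declaration line is omitted only to keep the string short.

```python
import xml.etree.ElementTree as ET

AUTHOR_XML = """<!DOCTYPE author [
<!ELEMENT author (#PCDATA)>
<!ENTITY writer "AJAY PATHAK">
<!ENTITY copyright "&#169;">
]>
<author>&writer;&copyright;</author>"""

author_text = ET.fromstring(AUTHOR_XML).text

# chr(169) is the character with decimal code 169, the copyright sign.
copyright_sign = chr(169)

print(author_text)
```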
(3). General entities : -
General entities must be declared
within the DTD before they can be used within an XML document. Instead of
representing only a single character, general entities can represent
characters, paragraphs, and even entire documents.
Syntax
To declare a general entity, use
a declaration of this general form in your DTD −
<!ENTITY ename "text">
Example
Following example demonstrates the general entity
declaration −
<?xml version = "1.0"?>
<!DOCTYPE note [
<!ENTITY source-text "AJAYPATHAK">
]>
<note>
&source-text;
</note>
Whenever an XML parser encounters a reference to source-text
entity, it will supply the replacement text to the application at the point of
the reference.
(4). Parameter entities : -
The purpose of a parameter entity is to enable you to create
reusable sections of replacement text.
Syntax
Following is the syntax for parameter entity declaration −
<!ENTITY % ename "entity_value">
In the above syntax, ename is the name of the entity and
entity_value is any text that does not contain
'&', '%' or ' " '.
Example
Following example demonstrates the parameter entity
declaration. Suppose you have element declarations as below –
<!ELEMENT residence (name, street, pincode, city, phone)>
<!ELEMENT apartment (name, street, pincode, city, phone)>
<!ELEMENT office (name, street, pincode, city, phone)>
<!ELEMENT shop (name, street, pincode, city, phone)>
Now suppose you want to add an additional element, country.
Then you need to add it to all four declarations. Hence we can go for a
parameter entity reference. Now, using parameter entity references, the above
example will be −
<!ENTITY % area "name, street, pincode, city">
<!ENTITY % contact "phone">
Parameter entities are
dereferenced in the same way as a general entity reference, only with a percent
sign instead of an ampersand −
<!ELEMENT residence (%area;, %contact;)>
<!ELEMENT apartment (%area;, %contact;)>
<!ELEMENT office (%area;, %contact;)>
<!ELEMENT shop (%area;, %contact;)>
When the parser reads these declarations, it substitutes the
entity's replacement text for the entity reference. Parameter entity
references written inside a declaration, as here, are allowed only in an
external DTD file, not in an internal DTD.
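The substitution step can be imitated in a few lines of Python to see what the declarations look like after the references are replaced. This is only an illustration of the idea with a hypothetical helper, expand_parameter_entities. A real parser does this work internally and handles more cases, such as spacing and nested references. The area entity used here is an assumption, rebuilt from the element lists in the example.

```python
import re

DTD_TEXT = """<!ENTITY % area "name, street, pincode, city">
<!ENTITY % contact "phone">
<!ELEMENT residence (%area;, %contact;)>
<!ELEMENT office (%area;, %contact;)>"""

def expand_parameter_entities(dtd_text):
    """Replace each %name; with the text declared for that parameter entity."""
    declared = dict(
        re.findall(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)">', dtd_text)
    )
    return re.sub(
        r"%([\w.-]+);", lambda match: declared[match.group(1)], dtd_text
    )

expanded = expand_parameter_entities(DTD_TEXT)
print(expanded)
```

To add country to every element, only the value of area has to change, which is the benefit of the parameter entity.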
END OF THE UNDERSTANDING XML, UNIT 2 IS COMPLETED