Programming Portal: February 2009

XML Document Anatomy part two

This post continues XML Document Anatomy part one; reading that post first will make the present topic easier to follow.

The Root Element (Lines 2 through 23)

Each XML document must have exactly one root element, and all the other elements must be completely enclosed in that element. Line 2 identifies the start of the <home.page> element (the start tag), and line 23 identifies the end of the element (the end tag).

Note that unlike HTML, in which a <P> tag might often be used as a sort of formatting instruction to insert a blank line between paragraphs of text, in XML an element normally consists of three things: a start tag, content (either text or other elements), and an end tag.

An XML element doesn’t always have content. Empty elements, such as the IMG element in HTML that simply points to an external graphics file through its SRC attribute, have no content. An empty element can still be written with an end tag, but a special form of start tag allows the explicit end tag to be omitted.

The name you use in the element start tag must exactly match the name you use in the end tag, including its case. If you use an odd combination of cases to increase the legibility of long names, you must repeat exactly the same combination in the end tag.

XML is case sensitive, recognizing the difference between uppercase letters (A–Z) and lowercase letters (a–z). In applications that aren’t case sensitive, mixed-case characters are usually folded into one case or the other. The ASCII character set usually folds to uppercase characters. Unicode usually folds to lowercase characters. XML has to account for this, and for the fact that it might have to deal with languages in which the case folding is uncertain. Therefore, XML does no case folding at all: names are compared exactly as written, and the keywords of the XML declaration (such as xml and version) have to be in lowercase.
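
The effect of case sensitivity can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the element name mainTitle and the helper is_well_formed are made up for illustration and do not come from the sample document.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical element name, chosen only to show a mixed-case tag.
matching = "<mainTitle>Welcome</mainTitle>"
mismatched = "<mainTitle>Welcome</maintitle>"  # end tag differs only in case

print(is_well_formed(matching))
print(is_well_formed(mismatched))
```

The second document is rejected because, to the parser, maintitle is simply a different name from mainTitle.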

An Empty Element (Line 13)

Empty elements are a special case in XML. In SGML and HTML, it is obvious from the DTD’s definition of an empty element that it is empty and has no content. XML, in keeping with its developers’ design goals, requires you to be much more explicit. Indeed, you might not use a DTD at all, so it could be hard to decide whether an element is or should be empty. Therefore, empty elements have to be very clearly identified as such. To do so, there is a special empty tag close delimiter, />, as in the following:

<empty_element/>

To maintain a certain degree of backward-compatibility with SGML (until such time as the SGML standard is updated to allow the use of empty-tag close delimiters), and to make the conversion of existing SGML and HTML code into XML a little easier (a process called normalization, which adds end tags to all elements and is supported by a lot of SGML tools), you can use an end tag instead of the special empty tag close delimiter. The element

<graphic source="file.gif"/>

is therefore interchangeable with

<graphic source="file.gif"></graphic>
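
As a quick check of that equivalence, the sketch below parses both forms with Python's standard xml.etree.ElementTree module and compares what the parser reports. The helper name summarize is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<graphic source="file.gif"/>')
long_form = ET.fromstring('<graphic source="file.gif"></graphic>')


def summarize(element):
    """Collect what the parser reports: name, attributes, text, child count."""
    return (element.tag, dict(element.attrib), element.text, len(element))


# Both spellings produce the same element as far as the parser is concerned.
print(summarize(short_form) == summarize(long_form))
```

An application reading the parsed document cannot tell which of the two spellings was used in the source.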

Attributes (Lines 7 and 22)

Element tags can include one or more optional or mandatory attributes that give further information about the elements they delimit.

Attributes can only be specified in the element start tag. The syntax for specifying an attribute is

<element.type.name attribute.name="attribute.value">

If elements were nouns, attributes would be adjectives. We could therefore say

<fruit taste="sharp">

or even

<problem size="huge" cause="unknown" solution="run.away">
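
To see how a program gets at these values, here is a small sketch using Python's standard xml.etree.ElementTree module. The end tag and the text content are added only so that the fragment is a complete document, and the attribute name owner is hypothetical, used to show what happens when an attribute is absent.

```python
import xml.etree.ElementTree as ET

# The start tag from the text, closed so that it forms a complete document.
problem = ET.fromstring(
    '<problem size="huge" cause="unknown" solution="run.away">text</problem>'
)

print(problem.tag)                           # element name
print(problem.attrib)                        # all attributes as a dictionary
print(problem.get("size"))                   # one attribute by name
print(problem.get("owner", "not specified")) # fallback for a missing attribute
```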

In direct contrast to SGML and HTML, in which multiple declarations are considered to be fatal errors, XML deals with multiple declarations of attributes in a unique manner. If an element’s attributes are declared once with one set of attributes and then declared again with a different set, the two sets of attributes are merged. The first declaration of an attribute for a particular element is the only one that counts, and any later declarations of that attribute are ignored. The XML processor might warn you about the appearance of multiple declarations, but it is not required to do so and processing can continue as normal.

Xml introduction

TESTING CONSTRAINTS PART TWO

LIFE CYCLE TESTING

TEST METRICS

Independent Software Testing

Test Process

Testing verification and validation

Functional and structural testing

Static and dynamic testing

V model testing

Eleven steps of V model testing

Structural testing

Execution testing technique

Recovery Testing technique

Operation testing technique

Compliance software testing technique

Security testing technique

XML Document Anatomy

XML’s rules for distinguishing between markup and content are:

1. The start of markup is identified by either the less-than symbol (<) or the ampersand character (&).

2. Three other characters are also treated as markup characters: the greater-than symbol (>), the apostrophe or single quote ('), and the (double) quotation mark (").

3. If you want to use any of the preceding special characters as normal characters, you must “escape” them by using the general entities that represent them.

To escape a character means to conceal it from a subsequent software package or process. In computing, the term often refers to prefixing certain characters in programming languages with a special character string to prevent them from being interpreted as special characters.

Originally the ESC (escape) character string was used to prefix commands sent to the printer itself to control such things as the font or page size and distinguish the command strings from printable characters.

4. Everything that is not markup is content (character data).
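
Rule 3 can be sketched in Python with the standard xml.sax.saxutils module. Its escape function handles &, < and > by default; the two quote characters are passed in through an extra mapping. The sample sentence and the name QUOTE_ENTITIES are invented for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; quotes need an explicit mapping.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}

raw = 'AT&T says "5 < 7" isn\'t markup'
escaped = escape(raw, QUOTE_ENTITIES)
print(escaped)

# Reverse the process to recover the original character data.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)
```

The ampersand is replaced first, so the entities introduced for the other characters are not escaped a second time.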

The following code shows the XML code for a Web home page. This is a very simple example, but it contains all the important parts that you will find in nearly all XML documents.


1:  <?xml version="1.0"?>
2:  <home.page>
3:    <head>
4:      <title>
5:        My Home Page
6:      </title>
7:      <banner source="topbanner.gif"/>
8:    </head>
9:    <body>
10:     <main.title>
11:       Welcome to My Home Page
12:     </main.title>
13:     <rule/>
14:     <text>
15:       <para>
16:         Sorry, this home page is still
17:         under construction. Please come
18:         back soon!
19:       </para>
20:     </text>
21:   </body>
22:   <footer source="foot.gif"/>
23: </home.page>
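
As an illustration of how a program sees this document, the following sketch feeds the same markup (without the line numbers) to Python's standard xml.etree.ElementTree module and picks out the root element, its direct children, and the empty elements. The variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

HOME_PAGE = """<?xml version="1.0"?>
<home.page>
  <head>
    <title>
      My Home Page
    </title>
    <banner source="topbanner.gif"/>
  </head>
  <body>
    <main.title>
      Welcome to My Home Page
    </main.title>
    <rule/>
    <text>
      <para>
        Sorry, this home page is still
        under construction. Please come
        back soon!
      </para>
    </text>
  </body>
  <footer source="foot.gif"/>
</home.page>"""

root = ET.fromstring(HOME_PAGE)

# The single root element and the elements directly enclosed in it.
print(root.tag)
print([child.tag for child in root])

# Elements with neither child elements nor non-blank text are the empty ones.
empty = [
    el.tag
    for el in root.iter()
    if len(el) == 0 and not (el.text or "").strip()
]
print(empty)
```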

XML Introduction

Problems with HTML:

1. HTML lacks syntactic checking and validation:

There are formal definitions of the structure of HTML documents. HTML is an SGML application, and there is a document type definition (DTD) for every version of HTML. In practice, however, Web browsers are designed to accept almost anything that looks even slightly like HTML. The only tag that is compulsory in an HTML document is the TITLE tag, and this is one of the least common tags there is.

2. HTML lacks content awareness:

Searching the Web is complicated by the fact that HTML doesn’t give you a way to describe the information content, i.e., the semantics, of documents. In XML you can use any tags you like (such as <NAME> instead of <H3>), but using attributes in tags (such as <H3 CLASS="name">) can embed just as much semantic information as custom tags can.

Without any agreement on tag names, the value of custom tags becomes a bit doubtful. To worsen matters, the same tag name in one context can mean something completely different in another. Furthermore, there are the complications of foreign languages: seeing <inkoopprijs> isn’t going to help very much if you don’t know that it’s Dutch for “purchase price.”

3. HTML is not object-oriented:

Modern programmers have been making a long and difficult transition to object-oriented techniques. They want to leverage these skills and have such things as inheritance, and HTML has done very little to accommodate them.

4. HTML lacks a robust linking mechanism:

If you’ve spent a few hours on the Web, you’ve probably encountered at least one broken link. HTML’s links are one-to-one, with the linking hard coded in the source HTML files. If the location of one target file changes, a Webmaster may have to update dozens or even hundreds of other pages.

5. HTML is not reusable:

Depending on how well written they are, HTML pages and fragments of HTML code can be extremely difficult to reuse because they are so specifically tailored to their place in the web of associated pages.

The Standard Generalized Markup Language (SGML), from which XML is derived, is useful for making data storage independent of any one software package or software vendor. SGML is a meta language, or a language for describing markup languages. HTML is one such markup language and is therefore called an SGML application. In XML, these applications are often called markup languages, such as the hand-held device markup language (HDML) and the FAQ markup language (QML).

But SGML is just too expensive and complicated for Web use on a large scale. Using SGML requires too much of an investment in time, tools, and training.

XML uses the features of SGML that it needs and tries to incorporate the lessons learned from HTML.

Advantages of XML:

1. XML can be used with existing Web protocols and mechanisms and it does not impose any additional requirements. XML has been developed with the Web in mind: features of SGML that were too difficult to use on the Web were left out, and features that are needed for Web use either have been added or are inherited from applications that already work.

2. XML supports a wide variety of applications. It is difficult to support a lot of applications with just HTML; hence, the growth of scripting languages. HTML is simply too specific. XML adopts the generic nature of SGML, but adds flexibility to make it truly extensible.

3. It is easy to write programs that process XML documents. One of the major strengths of HTML is that it’s easy for even a non-programmer to throw together a few lines of scripting code that enable you to do basic processing. HTML even includes some features of its own that enable you to carry out some basic processing.

4. XML documents are reasonably clear to anyone. A valid XML document:

Describes the structural rules that the markup attempts to follow
Lists any external resources (external entities) that are part of the document
Declares any internal resources (internal entities) that are used within the document
Lists the types of non-XML resources (notations) used and identifies any helper applications that might be needed
Lists any non-XML resources (binaries) that are used within the document and identifies any helper applications that might be needed

5. XML documents are easy to create. HTML is almost famous for its ease of use, and XML capitalizes on this strength.

Other Programming Courses :

Security testing and functional testing

Security Testing and Functional Testing

It’s not the primary job of a tester to find all the bugs in a product. Unless you have an extremely small product that runs on a very limited system, you won’t be able to find all the bugs unless you don’t plan on releasing the product for a very long time.

The primary job of a tester is not to get all bugs fixed either. There are always bugs that will remain unfixed as a conscious decision. Testers also do not generally make decisions on which bugs get fixed and which ones are deferred to a later date, or even those that there are no plans to fix.

The primary job of a tester is also not to decide when to ship a product. Although you can relay the state of the product to the company, the decision to release the product is typically made by the company or a team within the product group.

Functional testing is testing that is performed on behalf of a legitimate user of the product who is attempting to use it in the way it was intended to be used and for its intended purpose. This is the person the functional tester is really the advocate for. The majority of functional testing is done from the viewpoint of a customer.

Testing only from the customer’s viewpoint will cause you to bypass a large percentage of security tests. Most security vulnerabilities, although they have a chance of being discovered by the intended customers, are unlikely to be exploited by them.

The customer may call technical support to report the bug or maybe just grumble about it to friends or acquaintances. It’s unlikely that many of the intended customers will even recognize that bug as more than a nuisance or sign of poor quality, let alone correctly see it as a security risk.

The attention of functional testing is much more focused on how to enable the customers to perform their tasks in the easiest and most convenient way possible while providing enough checks and safety measures so that they can’t cause inadvertent harm too easily. It’s a sort of “protect them from themselves” mentality.

If any security testing is done, it tends to focus on things such as permissions and privileges but, again, only based around the assumption that the customer is using something like the login functionality as intended.

In essence, because you are performing tests on behalf of a customer, you are trusting that all people using the software you are testing are customers and not merely consumers.

Customers are the people or organizations that your software is intentionally written to solve a problem or problems for. They have been the main focus throughout the entire development cycle, from the gathering of requirements through the implementation, and that then provides the basis for functional testing.

Consumers, on the other hand, are those people or organizations that might use your software in a way it was or was not intended and who are not included in your customers. Sometimes your product’s consumer base grows because your product is able to perform some task as part of its normal repertoire, and that task is all that the consumer wishes to accomplish.

Sometimes it is because your product interfaces with some other software or hardware, and the consumer wants to use that ability to interface to their own advantage or because they think it may be exploitable.

Related Posts

Software security vocabulary

Software Security Vocabulary

Access Control List (ACL): A data structure or list that is maintained to track what users or groups have permissions to perform what actions. The term is used heavily on Windows, although other operating systems have ACLs as well.

Attack: A particular instance of an attempted introduction of one or more exploits to a system.

Attacker: Someone who is trying to bypass the security of one or more pieces of software to carry out his or her own agenda.

Back Door: A piece of malicious software that is installed and left running to provide a way for an attacker to regain system access at a later time.

Cracker: Someone who “cracks” through software security, particularly licensing and copy protection. It’s thought to have its roots in “safe cracker.” This term isn’t often used, in part because it’s more narrowly focused and in part because it’s just not as widely known, and the differentiation between a hacker and a cracker is not clear.

Cracking: The act of circumventing the copy protection, licensing, or registration functionality of software.

Daemon: A piece of software running in the background, usually as a process. Sometimes used interchangeably with “demon.”

Denial of Service (DoS): Where legitimate users are prevented from accessing services or resources they would normally be able to access.

Distributed Denial of Service (DDoS): Where legitimate users are prevented from accessing services or resources by a coordinated attack from multiple sources.

Escalation of Privilege: When attackers illegitimately gain more functionality or access than they are authorized to have.

Ethical Hacker: One that performs penetration tests. Sometimes ethical hackers are also called “white hats.”

Exploit: A code, technique, or program that takes advantage of a vulnerability to access an asset.

Firewall: An application or hardware appliance designed to diminish the chances of an attack by limiting specific types of information that can pass into or out of a system or network. It’s a piece of perimeter security.

Hacker: Someone who “hacks” together programs, i.e., writes them in a particularly haphazard or unorganized manner. This wasn’t originally a term that was specific to attackers, but in the last few years it has become an often-used synonym for attackers, especially in the press.

Hijacking: A situation when an attacker takes over control of one side of a two-sided conversation or connection.

Hub: A networking device that repeats the network packets on the physical network layer among many devices.

Information Disclosure: A situation when an attacker is able to access information he or she shouldn’t be able to.

Intrusion Detection System: An application that monitors a system or network and reports if it recognizes that the signs of an attack are present.

Leetspeak: The stereotypical sign of a script kiddie where text is written with numbers substituted for letters. The name comes from “elite.” For example, “leet” is often written as “1337” or “l33t.”

Media Access Control (MAC) Address: Also called the Physical Address, it is physically embedded in every network interface card (NIC) during the manufacturing process. MAC addresses are often treated as unique, although that is not actually guaranteed.

OSI Network Model/OSI Seven Layer Model: The Open Systems Interconnection Reference Model. This is commonly used to explain at what point certain processes are taking place and how information travels.

Personally Identifiable Information (PII): Information that is private to the user or machine. Disclosing PII is a violation of user privacy and can be a part of identity theft problems.

Phishing: Social engineering on a large scale, usually to obtain things like login information, credit card numbers, etc.

Protocol Stack: A system that implements protocol behavior based on a series of layers of the OSI Network Model.

Reverse Engineering: The act of wholly or partially recreating the algorithms or designs used in software. This is usually done without source code access.

Rootkit or Root Kit: A set of tools and scripts that an attacker installs after successfully compromising a system. These are designed to automate additional tasks including installing additional programs like key loggers, remote administration tools, packet sniffers, backdoors, etc. Kernel Rootkits are rootkits that hide themselves within the operating system’s kernel, making them a lot more difficult to detect.

Router: A hardware device that routes traffic between two networks. It can also disguise the traffic from the network behind it to make it appear as if all traffic comes from a single system.

Script kiddie: The somewhat derogatory term for an attacker who primarily downloads and uses exploit code designed and written by others. “Script kiddie” tends to be used to signify a copycat type of attacker that is not particularly skilled or creative on his or her own. A script kiddie is also considered to be young, cocky, and brash.

Social Engineering: The process of tricking or convincing a user into volunteering information the hacker can later use. This is often focused on things that are either finance related or material for identity theft.

Spoofing: Impersonating someone or something else — such as another user or machine — in order to trick software security checks or users.

Switch: A hardware device similar to a hub but which knows the hardware (MAC) addresses of each machine connected to it. This is so it can transmit packets only to the individual machine it is addressed to.

Threat: A possible path to illegitimate access of an asset.

Trojan Horse: A piece of malicious software designed to deceive the victims by appearing to be a benign program that they may wish to use and thus are willing to download or install.

Virus: A piece of malicious software that is capable of spreading itself, typically as part of a piece of software or a file that is shared between users.

Vulnerability: A bug in the software that would allow an attacker to make use of a threat to illegitimately access an asset. All vulnerabilities are threats, but only unmitigated threats are vulnerabilities.

Zero-Day Exploit: A vulnerability that is exploited immediately after its discovery, often before the software company or the security community is aware of the vulnerability.

See proof techniques here

Proof Techniques in Testing and Simulation

There are two approaches to proof of correctness: formal proof and informal proof. A formal proof consists of developing a mathematical logic consisting of axioms and inference rules and defining a proof either to be a proof tree in the natural deduction style or to be a finite sequence of axioms and inference rules.

Informal proof techniques follow the logical reasoning behind the formal proof techniques but without the formal logical system.

Simulation is used in real-time systems development where the "real-world" interface is critical and integration with the system hardware is central to the total design. In many non-real-time applications, simulation is a cost-effective verification and test-data generation technique.

To use simulation as a verification tool, several models must be developed. Verification is performed by using simulation to determine whether the model of the software behaves as expected on models of the computational and external environments. This technique is also a powerful way of deriving test data. Inputs are applied to the simulated model and the results recorded for later application to the actual code.
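
The idea of deriving test data from a model can be sketched in a few lines of Python. Everything named here is hypothetical: model_discount stands in for a model of the required behavior, real_discount stands in for the actual code, and the discount rule itself is invented for the example.

```python
def model_discount(order_total):
    """Hypothetical model of the requirement: 10% off orders of 100 or more."""
    return order_total * 0.9 if order_total >= 100 else order_total


def real_discount(order_total):
    """Hypothetical stand-in for the actual code under test."""
    if order_total >= 100:
        return order_total - order_total / 10
    return order_total


def derive_test_data(model, inputs):
    """Apply the inputs to the model and record (input, expected result) pairs."""
    return [(value, model(value)) for value in inputs]


def replay(implementation, test_data, tolerance=1e-9):
    """Apply the recorded data to the actual code and return any mismatches."""
    return [
        (value, expected, implementation(value))
        for value, expected in test_data
        if abs(implementation(value) - expected) > tolerance
    ]


# Step 1: derive test data from the model. Step 2: replay it on the code.
test_data = derive_test_data(model_discount, [0, 99.99, 100, 250])
mismatches = replay(real_discount, test_data)
print(mismatches)
```

An empty list of mismatches means the code agreed with the model on every recorded input; it says nothing about inputs that were never simulated.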

The data sets derived cause errors to be isolated and located as well as detected during the testing phase of the construction and integration stages.

To develop a model of the software for a particular stage in the development life cycle, a formal representation compatible with the simulation system is developed. This consists of the formal requirement specification, the design specification, or a separate model of the program behavior. If a separate model is used, then the developer will need to demonstrate and verify that the model is a complete, consistent, and accurate representation of the software at the stage of development being verified.

The next steps are to develop a model of the computational environment in which the system will operate, a model of the hardware on which the system will be implemented, and a model of the external demands on the total system.

These models can be largely derived from the requirements, with statistical representations developed for the external demand and the environmental interactions. The software behavior is then simulated with these models to determine if it is satisfactory.

Simulating the system at the early development stages is the only means of determining the system behavior in response to the eventual implementation environment. At the construction stage, since the code is sometimes developed on a host machine quite different from the target machine, the code may be run on a simulation of the target machine under interpretive control.

Simulation also plays a useful role in determining the performance of algorithms.

Related Posts

Walkthroughs and inspections in software testing