
Web services Project

You are a linguist working at an e-book distributor. Your management wants to find out whether the use of common words makes a book more popular, that is, whether more copies are sold when the number of commonly used words in the book is higher. They believe this would let them tailor their sales and marketing more closely to what customers will buy, rather than relying only on the popularity of the author or other factors. To test the theory, they want a web-services application that lets customers submit books in order to 1) check the popularity of the words used in the book; 2) rank the book as high, medium, or low; and 3) compare that ranking with the book's ranking in sales.

For this project you are only required to implement part 1) – popularity checking. There are different ways to implement the project. The simplest way is to create an RPC-style and a Document-style Web service to process a document (simple text). A more complex way is to use a combination of WSDL and database applications on both the server and the client side. You can choose either way.
The three operations performed by the web service will be the following:
Word Count: Counting the total number of words in a document.
Word Rank: A ranking of all words that appear in a document. You may refer to listings of the most commonly used words on the Internet; words that are not in the list will be ranked lowest (depending on your table length).
Word Repeatability: Counting the number of times each word is used in a document.
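
As a rough illustration of Word Count and Word Repeatability, here is a minimal Python sketch. The assignment does not define what counts as a word, so the tokenization rule (runs of letters, keeping internal hyphens and apostrophes, lower-cased) and the names `tokenize`, `word_count`, and `word_repeatability` are assumptions made for this example only.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally joined by
# internal hyphens or apostrophes (e.g. "bottom-up", "don't").
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def tokenize(text: str) -> list[str]:
    """Split text into lower-cased words."""
    return [w.lower() for w in WORD_RE.findall(text)]


def word_count(text: str) -> int:
    """Word Count: total number of words in the document."""
    return len(tokenize(text))


def word_repeatability(text: str) -> Counter:
    """Word Repeatability: how many times each word is used."""
    return Counter(tokenize(text))


if __name__ == "__main__":
    sample = "A cleared outgoing check is a valid transaction. A check is cleared."
    print(word_count(sample))
    print(word_repeatability(sample).most_common(3))
```

A different tokenization rule (for example, treating "bottom-up" as two words) would change both results, so the rule should be stated alongside the numbers.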

An example of word rank is shown below. This is a list of the top 10 commonly used English words. The word rank provided by your web service may or may not be the same as this.
Rank Word
1 the
2 of
3 to
4 and
5 a
6 in
7 is
8 it
9 you
10 that
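
One possible reading of Word Rank, sketched in Python: look up each distinct word of the document in the rank table, and give every word that is not in the table a single lowest rank equal to the table length plus one. The table here is the 10-word example above; the names `COMMON_WORDS`, `LOWEST_RANK`, and `word_rank`, and the simple letters-only tokenization, are assumptions for illustration.

```python
import re

# The example rank table from the assignment (rank 1 = most common).
COMMON_WORDS = ["the", "of", "to", "and", "a", "in", "is", "it", "you", "that"]
RANK_TABLE = {word: rank for rank, word in enumerate(COMMON_WORDS, start=1)}

# Assumption: unlisted words share one lowest rank, which depends on
# the table length as the assignment suggests.
LOWEST_RANK = len(COMMON_WORDS) + 1


def word_rank(text: str) -> list[tuple[int, str]]:
    """Return (rank, word) pairs for each distinct word, best rank first."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    return sorted((RANK_TABLE.get(w, LOWEST_RANK), w) for w in words)


if __name__ == "__main__":
    for rank, word in word_rank("A check is cleared if it is incoming"):
        print(rank, word)
```

With a longer table from a published frequency list, only `COMMON_WORDS` needs to change; the lowest rank moves with it.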
You will use the test document below to test the three operations your web service will provide on a document.

“Message-oriented choreography
The Paper Trail concept is that the state of a multi agent multi-process system can be looked at, sometimes rather effectively, as a function of the documents which have been transmitted.
The process-oriented attitude to a bank-customer relationship may be "In parallel, the customer writes checks, merchants pay in checks, credit card transactions happen, all month. Then, the charges, interest are assessed and a bank statement sent from the bank to the customer". The document-, or message-oriented one is more like "Every month a bank balance lists valid transaction dated that month. A cleared incoming check in a valid transaction. A cleared outgoing check is a valid transaction. A validated credit card ebit is a valid transaction. A check is cleared if it is incoming and there is a matching transfer from the payee bank", and so on. This builds the relationships up in a bottom-up, web like way. The process-oriented attitude suggests the bank be written as a procedure in a top-down way using for example WSCI and BPL. The document-oriented attitude suggests the use of business rules systems triggered by the receipt of new information -- new documents, in this case new web services messages.
(Web service messages are of course documents just like documents sent in email. Messages are particular in that they have a particular time of transmission, and their document content does not change. They do of course generally have identifiers, and even though they can only be accessed by sender and explicit receivers, they can still be regarded as part of the web by those parties.)
Whether the design process is a top-down process-oriented one or a bottom-up document-oriented one, the design will have to be translated into a set of agents and their responses to incoming messages. This manipulation can of course be done automatically.
A concern in all this frantic design is it evolution with time. A BPEL script sets out to be a description of a business process at a high level. The critical values which decide on conditional execution, or which correlate a particular process with a given transaction, are expressed as parts of the structure of the XML messages. This may lead to what has been called "DTD fragility". What happens which you change the DTD? The design of the message types with XML schema is the sort of thing which is difficult to get everyone in a company to agree on, and tents to change with time. There are many arbitrary choices made as to how the knowledge in the message is serialized as XML. Moving to RDF may, by removing a layer of arbitrary design, reduce that fragility and allow web service choreography to evolve with time within and outside a company.”
The mandatory fields (for storing customer and book information) that need to be added are:
Last name
First name
Address Street
Address city
Address state and zip code
When was the book purchased?
Where was the book purchased?
Book’s name
Book’s ISBN
Book’s author
Book’s edition
If you wish to separate these fields into several files (especially when you use a database system) or add more fields, please feel free to do so.
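
To make the mandatory fields concrete, here is one way they could be grouped into records, sketched with Python dataclasses. The grouping into `Customer`, `Book`, and `Submission`, and all field names, are assumptions for illustration; the assignment only lists the fields and leaves the layout open.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class Customer:
    last_name: str
    first_name: str
    street: str
    city: str
    state_and_zip: str  # the assignment lists state and zip code together


@dataclass
class Book:
    name: str
    isbn: str
    author: str
    edition: str


@dataclass
class Submission:
    """Hypothetical request record: who bought which book, plus its text."""
    customer: Customer
    book: Book
    purchased_on: date   # "When was the book purchased?"
    purchased_at: str    # "Where was the book purchased?"
    text: str            # the document to be analyzed


def to_record(submission: Submission) -> dict:
    """Flatten a submission into nested dicts, e.g. for serialization."""
    return asdict(submission)
```

Keeping customer and book data in separate records mirrors the option of splitting the fields across several files or database tables.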
The results of processing the document must be sent back to the client, following the path: Client submits document => Web Service processes it (server-side ranking) => Server sends results back to Client.

Additional information
Error Handling:
Your implementation should include the necessary error handling. You can follow the SOAP approach to handling faults by using special-purpose messages such as env:Fault.
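
For reference, a SOAP 1.2 fault is an `env:Fault` element inside the envelope body, carrying a code and a human-readable reason. The following Python sketch builds such a message with the standard library only. It assumes SOAP 1.2 (the assignment does not say which SOAP version to use), and the function name `build_fault` and its parameters are made up for this example; a real project would normally let its web-service toolkit generate faults.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"       # for xml:lang

# Serialize the envelope namespace with the conventional "env" prefix.
ET.register_namespace("env", SOAP_ENV)


def build_fault(reason: str, sender_error: bool = True) -> str:
    """Return a SOAP 1.2 envelope containing a single env:Fault."""

    def q(tag: str) -> str:
        return f"{{{SOAP_ENV}}}{tag}"

    envelope = ET.Element(q("Envelope"))
    body = ET.SubElement(envelope, q("Body"))
    fault = ET.SubElement(body, q("Fault"))

    # env:Sender = problem with the request, env:Receiver = server problem.
    code = ET.SubElement(fault, q("Code"))
    value = ET.SubElement(code, q("Value"))
    value.text = "env:Sender" if sender_error else "env:Receiver"

    reason_el = ET.SubElement(fault, q("Reason"))
    text = ET.SubElement(reason_el, q("Text"), {f"{{{XML_NS}}}lang": "en"})
    text.text = reason

    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_fault("The submitted document is empty."))
```

An empty or missing document is one case where the service might return a sender fault instead of word statistics.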
Development kits:
You can use any IDE or toolkit that you are familiar with, such as Visual Studio, NetBeans, Eclipse, the Java SDK, or Java EE. Your code may not require these development kits, but they are sometimes helpful for coding and for the cohesion of the project.

Bob Learned

Yes, what is the single question that you need help with?

Y 123

ASKER

Hello, I need help with the XML schema for the web service that checks word popularity in books. I have to write an XML schema for the following requirements, and I do not understand how to write it.
I need the popularity check. The three operations performed by the web service will be the following:
Word Count: Counting the total number of words in a document.
Word Rank: A ranking of all words that appear in a document. You may refer to listings of the most commonly used words on the Internet; words that are not in the list will be ranked lowest (depending on your table length).
Word Repeatability: Counting the number of times each word is used in a document.
An example of word rank is shown below. This is a list of the top 10 commonly used English words. The word rank provided by your web service may or may not be the same as this.
Rank Word
1 the
2 of
3 to
4 and
5 a
6 in
7 is
8 it
9 you
10 that

The mandatory fields (for storing customer and book information) that need to be added are:
Last name
First name
Address Street
Address city
Address state and zip code
When was the book purchased?
Where was the book purchased?
Book’s name
Book’s ISBN
Book’s author
Book’s edition

The results of processing the document must be sent back to the client, following the path: Client submits document => Web Service processes it (server-side ranking) => Server sends results back to Client.

I will appreciate all your help.

Bob Learned

Well, the first thing that you need to do to create an XML schema from a functional specification is to identify all the entities that you will need, such as people, books, clients, etc.

Then, you will need to find the attributes for those entities, such as last name, first name, address, etc.

Do you have any ideas about what you will need to use to generate the schema? I am a .NET developer, and my weapon of choice in this situation would be Visual Studio.

Bob Learned

Here is an XML schema overview article:

A Simple Overview of W3C XML Schema
http://www.codalogic.com/lmx/xsd-overview.html

Y 123

ASKER

(The asker re-posted the full assignment text from the original question.)

Bob Learned

I have already seen this, and it makes sense to me. My place here is to guide you to your own solution. If you have specific questions about how to identify XML elements and set their types, please ask those questions.

Y 123

ASKER

Yes. I want to know how I can identify XML elements. Previously I wrote an XML document for a flight reservation system. Would it be similar to that?

Bob Learned

An XML schema is different from plain XML, so if you don't have an understanding of a schema document, then you need to do some reading and researching.

When looking for elements, I look for words like "customer and book information". That gives me the clue that you need at least a Customer element and a Book element. Then you need to find clues to the attributes, like "Last name, First name, Address Street, ...".
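
To show where that reasoning can lead, here is a small sketch of a schema with a Customer type and a Book type, held in a Python string so it can be checked for well-formedness with the standard library. Every element and type name (`Submission`, `CustomerType`, `BookType`, `LastName`, and so on) is an assumption chosen for illustration, not something the assignment prescribes, and the check below only confirms the XML parses; it does not validate the schema itself.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: one complex type per entity, one child element per field.
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Submission">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Customer" type="CustomerType"/>
        <xs:element name="Book" type="BookType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="CustomerType">
    <xs:sequence>
      <xs:element name="LastName" type="xs:string"/>
      <xs:element name="FirstName" type="xs:string"/>
      <xs:element name="Street" type="xs:string"/>
      <xs:element name="City" type="xs:string"/>
      <xs:element name="StateAndZip" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="BookType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Author" type="xs:string"/>
      <xs:element name="Edition" type="xs:string"/>
      <xs:element name="PurchaseDate" type="xs:date"/>
      <xs:element name="PurchasePlace" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
""".strip()


def declared_elements(type_name: str) -> list[str]:
    """List the child element names declared inside a named complex type."""
    root = ET.fromstring(XSD)  # raises ParseError if the XML is malformed
    ctype = root.find(f"xs:complexType[@name='{type_name}']", {"xs": XS})
    return [el.get("name") for el in ctype.iter(f"{{{XS}}}element")]


if __name__ == "__main__":
    print(declared_elements("CustomerType"))
    print(declared_elements("BookType"))
```

Whether the purchase date and place belong to the book, the customer, or a separate purchase element is a design decision; this sketch simply places them with the book.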

You need to make a decision what you will use to create the XSD document. If you make that choice, I might be able to help you create the document.

David Johnson, CD

You have two separate homework questions here. What XML schema have you come up with?

Q1 refers to a book and is at least a three-part question. You have to get a word count and the frequency of the words in the book; this is part 1 of the three parts required. Do you need help getting a word count and frequency?
