Monday, March 19, 2012

Export XML to SQL 2K database

hello everyone-
I am fairly new to XML and understand SQL to XML transactions. However, I
am now faced with a mindnumbing problem. I've read and researched endlessly,
but can find nothing succint- no "here's where to start..." My company is
receiving NewsML feeds from which I need to extract particular data
(headline, dateline, body, etc.) for storage in a SQL table. Could someone
graciously point me in the right direction as to how I would get going on
this? There seem to be so many ways to do this and I've seen several msdn kb
articles, but I am still so confused. This will end up in an ASP.NET
application, using VB.NET -C# if I have to. :o)
TIA- Jenna
Hello Jenna!
I'd pass the feed text as a parameter to a stored procedure and use openxml to extract the data.
http://msdn.microsoft.com/library/de...oa-oz_5c89.asp
DECLARE @.NewsML varchar(8000)
-- NewsML sample xml from <a href="http://links.10026.com/?link=http://about.reuters.com/newsml/newsmlsample.asp">Reuters</a>
SET @.NewsML = '<NewsML Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_NEWSML">
<Catalog href="http://links.10026.com/?link=http://www.reuters.com/newsml/mastercatalog.xml"/>
<NewsEnvelope>
<DateAndTime>20020703T150148+0000</DateAndTime>
<NewsService FormalName="RTR_TNS"/>
<NewsProduct FormalName="TXT"/>
<Priority FormalName="3"/>
</NewsEnvelope>
<NewsItem Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_NEWSITEM">
<Identification>
<NewsIdentifier>
<ProviderId>reuters.com</ProviderId>
<DateId>20020703</DateId>
<NewsItemId>MTFH19813_2002-07-03_15-01-48_LOSAPJZA1</NewsItemId>
<RevisionId Update="N" PreviousRevision="0">1</RevisionId>
<PublicIdentifier>urn:newsml:reuters.com:20020703: MTFH19813_2002-07-03_15-01-48_LOSAPJZA1:1</PublicIdentifier>
</NewsIdentifier>
<DateLabel>2002-07-03 15:01:48 GMT (Reuters)</DateLabel>
</Identification>
<NewsManagement>
<NewsItemType FormalName="News"/>
<FirstCreated>20020703T150148+0000</FirstCreated>
<ThisRevisionCreated>20020703T150148+0000</ThisRevisionCreated>
<Status FormalName="Usable"/>
<Urgency FormalName="3"/>
</NewsManagement>
<NewsComponent EquivalentsList="no" Essential="no" Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_MAIN_NC" xml:lang="en">
<TopicSet FormalName="HighImportance">
<Topic Duid="ts_1">
<TopicType FormalName="CategoryCode"/>
<FormalName Scheme="MediaCategory">OEC</FormalName>
<Description xml:lang="en">Economic news, EC, business/financial pages</Description>
<Property FormalName="WhyPresent" Value="Classifier"/>
</Topic>
</TopicSet>
<Role FormalName="Main"/>
<AdministrativeMetadata>
<FileName>2002-07-03T150148Z_01_LOSAPJZA1_RTRIDST_0_MARKETS-EURO-RATES.XML</FileName>
<Provider>
<Party FormalName="Reuters"/>
</Provider>
<Source>
<Party FormalName="Reuters"/>
</Source>
<Property FormalName="SourceFeed" Value="IDS"/>
<Property FormalName="IDSPublisher" Value="http://www.reuters.com/ids"/>
</AdministrativeMetadata>
<!--Single "Main Text" inner NewsComponent-->
<NewsComponent EquivalentsList="no" Essential="no" Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_MAIN_TEXT_NC" xml:lang="en">
<Role FormalName="Main Text"/>
<NewsLines>
<HeadLine>Euro Rates at 1500 GMT - July 3</HeadLine>
<ByLine/>
<DateLine>July 3, 2002</DateLine>
<CreditLine>REUTERS</CreditLine>
<CopyrightLine>? Reuters 2002. All rights reserved. Republication or redistribution of Reuters content, including by caching, framing or similar means, is expressly prohibited without the prior written consent of Reuters. Reuters and the Reuters sphere logo are registered trademarks and trademarks of the Reuters group of companies around the world.</CopyrightLine>
<SlugLine>MARKETS-EURO-RATES</SlugLine>
<NewsLine>
<NewsLineType FormalName="Caption"/>
<NewsLineText>MARKETS-EURO-RATES:Euro Rates at 1500 GMT - July 3</NewsLineText>
</NewsLine>
</NewsLines>
<DescriptiveMetadata>
<Language FormalName="en"/>
<Genre FormalName="Table"/>
<OfInterestTo FormalName="AFA"/>
<OfInterestTo FormalName="CSA"/>
<OfInterestTo FormalName="LBY"/>
<OfInterestTo FormalName="RWSA"/>
<OfInterestTo FormalName="RWS"/>
<OfInterestTo FormalName="REULB"/>
<OfInterestTo FormalName="GNS"/>
<OfInterestTo FormalName="SXNA"/>
<TopicOccurrence Importance="High" Topic="#ts_1"/>
</DescriptiveMetadata>
<ContentItem Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_MAIN1_TEXT_CI">
<MediaType FormalName="Text"/>
<Format FormalName="XHTML"/>
<Characteristics>
<Property FormalName="ContentID" Value="urn:newsml:reuters.com:20020703:MTFH19813_2 002-07-03_15-01-48_LOSAPJZA1_TXT:1"/>
<Property FormalName="ContentCreationDateAndTime" Value="20020703T150148+0000"/>
<Property FormalName="USN" Value="LOSAPJZA1"/>
<Property FormalName="Creator" Value="RTR_JANUS 2.300"/>
</Characteristics>
<DataContent>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head>
<body>
<p> LONDON, July 3 (Reuters) - Following are the middle exchange rates for leading currencies against the euro:</p>
<p> CURRENT HIGH LOW US Dollar 0.9803 0.9863 0.9767 Japanese Yen 117.48 118.23 117.29 British Pound 0.6423 0.6436 0.6408 Swiss Franc 1.4627 1.4664 1.461 Danish Crown 7.428 7.4292 7.428 Greek Drachma - - - Norwegian Crown 7.3362 7.3745 7.3306 Swedish Crown 9.102 9.134 9.058 Australian Dollar 1.7542 1.7591 1.7434 Canadian Dollar 1.5034 1.5046 1.4921 Hong Kong Dollar 7.6464 7.6779 7.6224 Russian Rouble 30.92 31.1 30.81 Singapore Dollar 1.7335 1.7418 1.731 </p>
</body>
</html>
</DataContent>
</ContentItem>
</NewsComponent>
</NewsComponent>
</NewsItem>
</NewsML>'
DECLARE @.NewsMLPtr int
EXEC sp_xml_preparedocument @.NewsMLPtr OUTPUT, @.NewsML
SELECT *
FROM OPENXML(@.NewsMLPtr, '/NewsML/NewsItem')
WITH (
Duid varchar(100) '@.Duid',
RevisionId tinyint 'Identification/NewsIdentifier/RevisionId',
FirstCreated varchar(100) 'NewsManagement/FirstCreated',
Topic varchar(100) 'NewsComponent/TopicSet/Topic/Description'
)
EXEC sp_xml_removedocument @.NewsMLPtr
|||Samu thanks so much. I'm going to give this a shot!
"Samu Lang" wrote:

> Hello Jenna!
> I'd pass the feed text as a parameter to a stored procedure and use openxml to extract the data.
> http://msdn.microsoft.com/library/de...oa-oz_5c89.asp
>
> DECLARE @.NewsML varchar(8000)
>
>
> -- NewsML sample xml from <a href="http://links.10026.com/?link=http://about.reuters.com/newsml/newsmlsample.asp">Reuters</a>
> SET @.NewsML = '<NewsML Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_NEWSML">
> <Catalog href="http://links.10026.com/?link=http://www.reuters.com/newsml/mastercatalog.xml"/>
> <NewsEnvelope>
> <DateAndTime>20020703T150148+0000</DateAndTime>
> <NewsService FormalName="RTR_TNS"/>
> <NewsProduct FormalName="TXT"/>
> <Priority FormalName="3"/>
> </NewsEnvelope>
> <NewsItem Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_NEWSITEM">
> <Identification>
> <NewsIdentifier>
> <ProviderId>reuters.com</ProviderId>
> <DateId>20020703</DateId>
> <NewsItemId>MTFH19813_2002-07-03_15-01-48_LOSAPJZA1</NewsItemId>
> <RevisionId Update="N" PreviousRevision="0">1</RevisionId>
> <PublicIdentifier>urn:newsml:reuters.com:20020703: MTFH19813_2002-07-03_15-01-48_LOSAPJZA1:1</PublicIdentifier>
> </NewsIdentifier>
> <DateLabel>2002-07-03 15:01:48 GMT (Reuters)</DateLabel>
> </Identification>
> <NewsManagement>
> <NewsItemType FormalName="News"/>
> <FirstCreated>20020703T150148+0000</FirstCreated>
> <ThisRevisionCreated>20020703T150148+0000</ThisRevisionCreated>
> <Status FormalName="Usable"/>
> <Urgency FormalName="3"/>
> </NewsManagement>
> <NewsComponent EquivalentsList="no" Essential="no" Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_MAIN_NC" xml:lang="en">
> <TopicSet FormalName="HighImportance">
> <Topic Duid="ts_1">
> <TopicType FormalName="CategoryCode"/>
> <FormalName Scheme="MediaCategory">OEC</FormalName>
> <Description xml:lang="en">Economic news, EC, business/financial pages</Description>
> <Property FormalName="WhyPresent" Value="Classifier"/>
> </Topic>
> </TopicSet>
> <Role FormalName="Main"/>
> <AdministrativeMetadata>
> <FileName>2002-07-03T150148Z_01_LOSAPJZA1_RTRIDST_0_MARKETS-EURO-RATES.XML</FileName>
> <Provider>
> <Party FormalName="Reuters"/>
> </Provider>
> <Source>
> <Party FormalName="Reuters"/>
> </Source>
> <Property FormalName="SourceFeed" Value="IDS"/>
> <Property FormalName="IDSPublisher" Value="http://www.reuters.com/ids"/>
> </AdministrativeMetadata>
> <!--Single "Main Text" inner NewsComponent-->
> <NewsComponent EquivalentsList="no" Essential="no" Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_MAIN_TEXT_NC" xml:lang="en">
> <Role FormalName="Main Text"/>
> <NewsLines>
> <HeadLine>Euro Rates at 1500 GMT - July 3</HeadLine>
> <ByLine/>
> <DateLine>July 3, 2002</DateLine>
> <CreditLine>REUTERS</CreditLine>
> <CopyrightLine>? Reuters 2002. All rights reserved. Republication or redistribution of Reuters content, including by caching, framing or similar means, is expressly prohibited without the prior written consent of Reuters. Reuters and the Reuters s
phere logo are registered trademarks and trademarks of the Reuters group of companies around the world.</CopyrightLine>
> <SlugLine>MARKETS-EURO-RATES</SlugLine>
> <NewsLine>
> <NewsLineType FormalName="Caption"/>
> <NewsLineText>MARKETS-EURO-RATES:Euro Rates at 1500 GMT - July 3</NewsLineText>
> </NewsLine>
> </NewsLines>
> <DescriptiveMetadata>
> <Language FormalName="en"/>
> <Genre FormalName="Table"/>
> <OfInterestTo FormalName="AFA"/>
> <OfInterestTo FormalName="CSA"/>
> <OfInterestTo FormalName="LBY"/>
> <OfInterestTo FormalName="RWSA"/>
> <OfInterestTo FormalName="RWS"/>
> <OfInterestTo FormalName="REULB"/>
> <OfInterestTo FormalName="GNS"/>
> <OfInterestTo FormalName="SXNA"/>
> <TopicOccurrence Importance="High" Topic="#ts_1"/>
> </DescriptiveMetadata>
> <ContentItem Duid="MTFH19813_2002-07-03_15-01-48_LOSAPJZA1_MAIN1_TEXT_CI">
> <MediaType FormalName="Text"/>
> <Format FormalName="XHTML"/>
> <Characteristics>
> <Property FormalName="ContentID" Value="urn:newsml:reuters.com:20020703:MTFH19813_2 002-07-03_15-01-48_LOSAPJZA1_TXT:1"/>
> <Property FormalName="ContentCreationDateAndTime" Value="20020703T150148+0000"/>
> <Property FormalName="USN" Value="LOSAPJZA1"/>
> <Property FormalName="Creator" Value="RTR_JANUS 2.300"/>
> </Characteristics>
> <DataContent>
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <title/>
> </head>
> <body>
> <p> LONDON, July 3 (Reuters) - Following are the middle exchange rates for leading currencies against the euro:</p>
> <p> CURRENT HIGH LOW US Dollar 0.9803 0.9863 0.9767 Japanese Yen 117.48 118.23 117.29 British Pound 0.6423 0.6436 0.6408 Swiss Franc
1.4627 1.4664 1.461 Danish Crown 7.428 7.4292 7.428 Greek Drachma - - - Norwegian Crown 7.3362 7.3745 7.3306 Swedish Crown 9.102 9.13
4 9.058 Australian Dollar 1.7542 1.7591 1.7434 Canadian Dollar 1.5034 1.5046 1.4921 Hong Kong Dollar 7.6464 7.6779 7.6224 Russian Rouble 30.92 31.1 30.81 Singapore Doll
ar 1.7335 1.7418 1.731 </p>
> </body>
> </html>
> </DataContent>
> </ContentItem>
> </NewsComponent>
> </NewsComponent>
> </NewsItem>
> </NewsML>'
>
>
> DECLARE @.NewsMLPtr int
> EXEC sp_xml_preparedocument @.NewsMLPtr OUTPUT, @.NewsML
> SELECT *
> FROM OPENXML(@.NewsMLPtr, '/NewsML/NewsItem')
> WITH (
> Duid varchar(100) '@.Duid',
> RevisionId tinyint 'Identification/NewsIdentifier/RevisionId',
> FirstCreated varchar(100) 'NewsManagement/FirstCreated',
> Topic varchar(100) 'NewsComponent/TopicSet/Topic/Description'
> )
> EXEC sp_xml_removedocument @.NewsMLPtr
>

No comments:

Post a Comment