How's My Network? - WPI

PREDICTING PERFORMANCE FOR READING NEWS ONLINE FROM WITHIN A BROWSER SANDBOX
Murad Kaplan
Advisor: Mark Claypool
Reader: Craig Wills
M.S. Thesis Presentation

Online News
- Increasingly important Internet activity
- Korea: more than half of the population reads news online [The OECD Report. 2009]
- 62% of US Internet users aged 12-17 go online for news [The guardian. 10]
- 73% of Internet users read news online [The guardian. 10]
- Mobile access to the Internet is on the rise, and the reading of news on the platform is likely to follow this development [Pew Internet Project. 10]
- Web sites must display a significant amount of content on the home page [E. Jorden. 2010]

Current Limitations to Measuring Performance for Online News
- Available platforms provide low-level network data that is not necessarily understandable to average users
- Web site performance measurement tools focus on the server side, with measurements not readily mapping to user experience
- No prior research on performance measurement has targeted online news

Goal
- Predict performance for online news sites by:
  - Selecting characteristics of news sites to be measured
  - Selecting suitable methods of measuring
  - Analyzing collected data
  - Building models based on analysis
  - Evaluating models
- Provide performance from the user perspective
  - Choosing a specific news site
  - Provide meaningful results (very good, good, bad, etc.)
- Predict performance with small costs
  - Little time (< 3 seconds)
  - Few downloads (max 7 objects)
- Apply to other sites
- Implement in HMN

Outline
- Introduction
- Background
- Approach
- Evaluation
- Conclusion
- Future Work

Network Measurement Platforms
- Speedtest
  - Limited incentives for typical users (download, upload, ping)
  - Not designed to inform network researchers
- Netalyzr [3,4]
  - A broad range of network measurements
  - Output not meaningful for typical users
- Gomez
  - Offers monetary incentive
  - Needs software

Web Characterization
- Has been done since almost the beginning of the World Wide Web [J. Pitkow 98]
- Better understanding of object types/sizes on the Web for network performance and measurement
- Provide Web designers with their Web sites' performance for end users [Web Characterization Project. 02]
- No characterization for specific Web site types such as news, shopping, etc.

Background - HMN
- Overcome the impediments in the existing measurement platforms
- Increase the incentives for users/research experts
- New techniques using JavaScript and Flash from within the browser sandbox environment
- Applied to real-world Web applications

Approach
- Characterize news sites and analyze Web browser behaviors
- Design prediction models
- Set up environment
- Implement models and evaluate results

Characterization and Analysis
- Characterization for news sites
  - Choose most popular news sites [The EbizMbA. 2011]: CNN, New York Times, LA Times, and MSN
  - Collect:
    - Number of objects per page
    - Sizes of objects
    - Number of domains objects come from
- Web browser behaviors
  - Choose most popular Web browsers [Browserscope. 2011]: Chrome 14, Firefox 3.6, and Internet Explorer 8
  - Analyze:
    - Mechanism for retrieving Web pages
    - Number of connections per hostname
    - Number of connections for all hostnames

Characterization for News Sites
- Three levels for characterization: home page, sections (sport, world, health, etc.), and articles
- Diagram: Home Page, then sections (World, Health, Politics, Travel, Sport), then Article
- Use Pagestats [10] to crawl news pages

Characterization Results
- Distribution of objects differs across sites
Figure: Object Size Distribution for Home Page of the Four News Sites

- MSN: usually 80% of objects < 5 KByte
- LAT and CNN: larger objects
- Sections, except Sport, are similar to Home

Figure: Number of Objects in Home Page in News Sites
- Similarity in number of objects in CNN and NYT

Figure: Page Size for Home Page in News Sites
- Similar page sizes except LAT

Figure: Number of Objects among the Levels in News Sites
- High number of objects doesn't mean large page size
Figure: Page Size in All Levels in News Sites

Domains
- MSNBC-Home
  - http://www.msnbc.msn.com: 23.60%
  - http://msnbcmedia.msn.com: 23.60%
  - http://msnbcmedia4.msn.com/: 11.18%
  - http://msnbcmedia2.msn.com: 11.80%
- LA-Home
  - http://www.latimes.com: 77.83%
  - https://latimes.signon.trb.com/ and http://b.scorecardresearch.com: 0.99% and 1.48%

Characterization Summary
- Similarity, but there is some variance

Browser Behaviors
- IE, CNN home page
- Fiddler [Fiddler Web Debugger]

Prediction Methods
- Characterization observations
  - Container loading
  - Domains that browsers retrieve their objects from
  - Serial vs. parallel downloads
- Model 1. Serial Total (ST)
  - Download container
  - Download average-size object one time
  - Use total number of objects in the page (from all domains)
- Model 2. Serial Dominant (SD)
  - Download container
  - Download average-size object one time
  - Use total number of objects in the dominant domain only
- Model 3. Parallel Total (PT)
  - Download container
  - Download average-size object six times in parallel
  - Use total number of objects in the page (from all domains)
- Model 4. Parallel Dominant (PD)
  - Download container
  - Download average-size object six times in parallel
  - Use total number of objects in the dominant domain only

Prediction Methods
- Tc: time to download container
- To: time to download an average-size object
- Nt: number of total objects
- Nd: number of objects in the dominant domain
- P: number of downloads in parallel
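A minimal sketch of how the four models might be computed from the symbols Tc, To, Nt, Nd, and P. The formulas are an assumption and are not taken from the slides: the serial models add one To per object, and the parallel models add one To per batch of P objects. The names Measurement, dominant_domain_count, and predict are hypothetical helpers introduced for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable
from urllib.parse import urlparse


@dataclass
class Measurement:
    tc: float   # Tc: seconds to download the container page
    to: float   # To: seconds to download one average-size object
    nt: int     # Nt: total number of objects on the page
    nd: int     # Nd: number of objects in the dominant domain
    p: int = 6  # P: number of downloads in parallel


def dominant_domain_count(object_urls: Iterable[str]) -> int:
    """Count the objects served by the hostname that serves the most objects."""
    hosts = Counter(urlparse(url).netloc for url in object_urls)
    return hosts.most_common(1)[0][1] if hosts else 0


def predict(m: Measurement) -> Dict[str, float]:
    """Return predicted page load time in seconds for ST, SD, PT, and PD."""

    def serial(n: int) -> float:
        # Assumed form: container time plus one To per object.
        return m.tc + n * m.to

    def parallel(n: int) -> float:
        # Assumed form: container time plus one To per batch of P objects.
        return m.tc + math.ceil(n / m.p) * m.to

    return {
        "ST": serial(m.nt),
        "SD": serial(m.nd),
        "PT": parallel(m.nt),
        "PD": parallel(m.nd),
    }
```

Under these assumptions, Tc = 0.5 s, To = 0.2 s, Nt = 120, Nd = 90, and P = 6 give ST = 24.5 s, SD = 18.5 s, PT = 4.5 s, and PD = 3.5 s.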

Experiment Setup
- Testbed (diagram): client (new Dell, Windows 7) connected through a UNIX bridge (eth0, eth1); link settings 1 Mbit/0.256 Mbit, 50 msec
- Extend to 10 most popular news sites: UST, ABC, RUE, WPT, HPT, BBC, LAT, CNN, NYT, MSN
- 5 times, 3 browsers, 4 models

Evaluation
- A glance at news site download times
- Difference in download time across news sites
- Difference in download time for one site across browsers (object types)

Serial vs. Parallel
- Dominant domain always wins

Predicting User Experience
- Measured time differences may be of interest to network researchers
- A typical user may not notice the impact of an additional few seconds of page load time
- Provide performance predictions intended to have more relevance than time alone [Net Forecasts et al. 02] [S. Souder. High Performance Web sites 09]
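The evaluation reports prediction error in stars rather than seconds. A minimal sketch of how such an error distribution might be computed follows. The five-star scale, the time thresholds, and the function names to_stars and star_error_cdf are assumptions made for illustration; the slides do not state the actual mapping from load time to stars. The sketch uses absolute error, so it does not distinguish under-prediction from over-prediction.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def to_stars(seconds: float,
             thresholds: Sequence[float] = (2.0, 4.0, 8.0, 16.0)) -> int:
    """Map a page load time to 1..5 stars (hypothetical thresholds)."""
    stars = 5
    for limit in thresholds:
        if seconds <= limit:
            return stars
        stars -= 1
    return stars  # slower than every threshold: 1 star


def star_error_cdf(pairs: Iterable[Tuple[float, float]]) -> Dict[int, float]:
    """Given (predicted_seconds, measured_seconds) pairs, return the
    fraction of predictions whose absolute star error is <= k, per k."""
    errors = [abs(to_stars(pred) - to_stars(meas)) for pred, meas in pairs]
    if not errors:
        return {}
    counts = Counter(errors)
    cdf: Dict[int, float] = {}
    running = 0
    for k in range(max(errors) + 1):
        running += counts.get(k, 0)
        cdf[k] = running / len(errors)
    return cdf
```

In this sketch, cdf[0] is the share of "perfect" predictions and cdf[1] is the share within one star.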

- Some predictions "perfect", others under, others over
- Parallel slightly better than Serial
Figure: Prediction Error for News in Firefox

- PD: perfect predictions > 40% of the time
- SD: worse, < 30%
- For about 3% of the predictions, PD is nearly 3 stars in error, compared to only 0.5% for SD
Figure: Cumulative Distribution of Prediction Errors for all News Sites and Browsers

- IE: about 50% of predictions are perfect and about 85% are within 1 star
- Firefox: 45% of predictions are perfect and about 90% are within 1 star
- Chrome: 30% of predictions are perfect and about 90% are within 1 star

Figure: Cumulative Distribution of Prediction Errors for PD for all News Sites across Browsers

Using Our Methods on Different Types of Web Sites
- For online shopping, about 65% of the predictions are perfect and no predictions are worse than 2 stars in error.

Conclusion
- Online news prediction techniques in HMN can provide low impediment and high incentive for researchers and typical users.
- Using the number of objects from the dominant domain is always better than using the total number of objects
  - 15% to 60% better
- Assuming objects download in parallel rather than serially provides generally better predictions
  - 15% perfect predictions for online news.
- Our methods can be used for other Web sites
  - 65% perfect predictions for shopping sites
  - 39% perfect predictions for social networks

Future Work
- Extend Web characterization to different Web sites.
- Develop our models to include other factors such as object types.
- Extend to target multimedia in online news.

References
[1] The OECD report. "The future of news and the Internet", Organization for Economic Cooperation and Development, June 2009. http://www.oecd.org/document/48/0,3343,en_2649_34223_45449136_1_1_1_1,00.html
[2] E. Jorden. Newspaper Website Design, 2010. http://www.ejordanweb.com/index.php?option=com_content&view=article&id=62:newspaper-website-design&catid=19:news&Itemid=176
[3] SpeedTest. http://www.speedtest.net/
[4] Planetlab. http://www.planet-lab.org/
[5] F. Papadopoulos and K. Psounis. Predicting the performance of Internet-like networks using scaled-down replicas. In ACM SIGMETRICS Performance Evaluation Review, Volume 35, Issue 3, December 2007.
[6] C. Xing, M. Chen, and L. Yang. Predicting Available Bandwidth of Internet Path with Ultra Metric Space.
[7] kc claffy, Mark Crovella, Timur Friedman, Colleen Shannon, and Neil Spring. Community-oriented network measurement infrastructure (CONMI) workshop report. SIGCOMM Comput. Commun. Rev., 36(2):41-48, 2006.
[8] J. Pitkow. Summary of WWW Characterizations. In Computer Networks and ISDN Systems, Volume 30, Issue 1-7, April 1, 1998.
[9] E. O'Neill. OCLC, Online Computer Library Center, Web Characterization Project. Wcp.oclc.org, 2002.
[10] Pagestats. http://web.cs.wpi.edu/~weizhang/docs/pagestats.xpi
[11] http://www.ebizmba.com/articles/news-websites
Fiddler Web Debugger - A free web debugging tool. www.fiddler2.com/
