    WebScrapperJS - Get Content/HTML of any website without being blocked by CORS even using JavaScript by WhollyAPI

    WebScrapperJS



    Website :- https://sh20raj.github.io/WebScrapperJS/


    Grab the CDN or Download the JavaScript File

    <script src="https://cdn.jsdelivr.net/gh/SH20RAJ/WebScrapperJS/WebScrapper.min.js" ></script>
    

    Get the HTML/Text Content of Any Website with WebScrapper.gethtml() or WebScrapper.get()

    var url = 'https://google.com/';
    var html = WebScrapper.gethtml(url);//html of the url will be stored in this variable
    console.log(html);
    

    WebScrapper.gethtml() and WebScrapper.get() work similarly.
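
    Once the page HTML is in a variable, a common next step is to pull a value out of it. The sketch below assumes that gethtml() hands back the page as a plain string, as the example above suggests. The helper name extractTitle and the sample string are hypothetical and not part of WebScrapperJS.

```javascript
// Hypothetical helper (not part of WebScrapperJS): pull the <title> text
// out of an HTML string with a deliberately simple regular expression.
function extractTitle(html) {
  if (typeof html !== 'string') return null;
  const match = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  return match ? match[1].trim() : null;
}

// Stand-in for a string such as the one WebScrapper.gethtml(url) might return.
const sampleHtml = '<html><head><title> Example Domain </title></head><body></body></html>';
console.log(extractTitle(sampleHtml)); // prints: Example Domain
```

    A regular expression is enough for a single well-known tag. For anything more involved, parsing the string with the browser's built-in DOMParser is likely to be more robust.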


    Initialise Your Own WebScrapper with a URL Using new scrapper()

    let MyWebScrapper = new scrapper('https://example.com/');
    //You can now directly call gethtml() instead of passing a url into it.
    
    console.log(MyWebScrapper.gethtml()); //Grab https://example.com/ and print on console
    
    

    You can still use the newly created scrapper, MyWebScrapper, to grab other URLs, like this:

    let MyWebScrapper = new scrapper('https://example.com/');
    //You can now directly call gethtml() instead of passing a url into it.
    
    console.log(MyWebScrapper.gethtml()); //Grab https://example.com/ and print on console
    
    console.log(MyWebScrapper.gethtml('https://youtube.com/')); //Grab https://youtube.com/ and print on console
    
    

    You can also fetch JSON using WebScrapperJS

    var json = WebScrapper.getjson('https://jsonplaceholder.typicode.com/todos/1'); //Returns the result directly as JSON
    console.log(json);
    


    Getting Results Faster

    Use the methods below only if the URL you are fetching is not blocked by CORS for your origin.

    If your origin is not blocked, use the fetch() methods below instead of gethtml(). They return results faster because they skip the API and request the target URL directly using AJAX.
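
    The choice between the direct route and the API-backed route can also be made at run time: try the direct request first and fall back only when it fails. The following is a minimal sketch of that idea, not how WebScrapperJS works internally. The helper getTextDirectFirst and its two parameters are hypothetical. Both fetchers are passed in as functions, and the direct one is assumed to behave like the standard browser fetch().

```javascript
// Hypothetical helper (not part of WebScrapperJS). Both fetchers are passed
// in, so nothing here assumes how the library works internally.
async function getTextDirectFirst(url, directFetch, fallbackFetch) {
  try {
    const response = await directFetch(url);
    if (!response.ok) {
      throw new Error('HTTP ' + response.status);
    }
    return await response.text();
  } catch (err) {
    // A CORS block surfaces in browsers as a rejected fetch (a TypeError),
    // so any failure here falls through to the slower route.
    return fallbackFetch(url);
  }
}

// Possible browser usage, assuming WebScrapper is loaded from the CDN and
// gethtml() returns the HTML as shown earlier:
// getTextDirectFirst(url, fetch, (u) => WebScrapper.gethtml(u)).then(console.log);
```

    Because a blocked request is only discovered after it fails, this approach costs one extra round trip for blocked URLs. If you already know a URL is blocked, calling gethtml() directly avoids that.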

    Use WebScrapper.fetch() to get the html/text

    This example uses https://webscrapperjs.sh20raj.repl.co/ because it is not blocked.

    var html = WebScrapper.fetch('https://webscrapperjs.sh20raj.repl.co/'); //This will return the HTML/text of the webpage
    console.log(html);
    

    Use WebScrapper.fetchjson() to get the Parsed JSON

    var json = WebScrapper.fetchjson('https://webscrapperjs.sh20raj.repl.co/sample.json'); //This will return the parsed JSON found at the URL
    console.log(json);
    


    Try this on Codepen

    Sample Code | Codepen :- https://codepen.io/SH20RAJ/pen/VwrwjXJ?editors=1001

    <div id="scrappedcontent"></div>

    <script src="https://cdn.jsdelivr.net/gh/SH20RAJ/WebScrapperJS/WebScrapper.min.js"></script>
    <script>
      let MyWebScrapper = new scrapper('https://google.com/');
      // You can now call gethtml() directly instead of passing a URL into it.

      console.log(MyWebScrapper.gethtml()); // Grab https://google.com/ and print it to the console

      var html = MyWebScrapper.gethtml('https://example.com/');
      console.log(html); // Grab https://example.com/ and print it to the console

      // Show the scraped HTML inside the placeholder div.
      document.getElementById('scrappedcontent').innerHTML = html;
    </script>
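
    The sample above renders the scraped markup by assigning it to innerHTML. If you would rather display the markup as text, one option is to escape it first. The following is a minimal sketch. The helper escapeHtml is hypothetical and not part of WebScrapperJS.

```javascript
// Hypothetical helper (not part of WebScrapperJS): escape the characters
// that HTML treats specially so a scraped string can be shown as text.
function escapeHtml(text) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

console.log(escapeHtml('<b onclick="x()">hi</b>'));
// prints: &lt;b onclick=&quot;x()&quot;&gt;hi&lt;/b&gt;
```

    In the sample page this could be used as document.getElementById('scrappedcontent').innerHTML = escapeHtml(html). Assigning the string to textContent instead of innerHTML has a similar effect without a helper.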
    