[공유] Peerlyst - Some useful Open Source Intelligence (OSINT) tools by David Dunmore
해당 포스트는 Peerlyst - Some useful Open Source Intelligence (OSINT) tools by David Dunmore 포스트에 있는 OSINT 도구들에 대한 내용만 발췌한 포스트입니다.
Maltego
The aim ofMaltegois to analyze real world relationships between people, web pages and sites, groups,domains,network infrastructure, SocialNetworks, and indeed anything else that is discoverable on the internet. Maltego can then present these results in a variety of graphical formats.
Here’s Maltego’sUIfrom my installation (On PCLinuxOS using the rpm packaged installation file), showing the default transforms, and a couple of useful free (As in free Beer) that I’ve found to be useful.
I wrote a short series of two posts about Maltego’s free Community edition a while ago called ‘How to use Maltego’Part 1andPart 2.
Shodan
On theHome page,Shodandescribes itself as ‘Thesearch enginefor theInternetOf Things’. It not a free Open Sourceresource, rather Shodan has severalsubscription plans, which are on a monthly rolling basis, so you can subscribe for just one month to evaluate its usefulness. The main difference between them is the number ofIP addressesper month you can access.
Shodan also has anAPI, use of which is included with all the subscription options, but you have to register separately to access the API. Thedocumentationfor Shodan’sREST APIis availablehere.
NOTE: All the API methods are limited to 1 request per second. This may or may not be a limitation you can live with. Fordevelopmentpurposes it’s unlikely to be an issue, but I can see this being a bottleneck in some investigative scenarios.
Despite the overtly nerdy, techy name, there’s really nothing mysterious going on here. Simply put,GoogleDorksare just smarter / more advanced ways to search using Google for more specific results. Once you’re familiar with some of thesetechniques, you will (genuinely) wonder how you searched previously.
“Double quotes” - (“”) search specifically for the exact words inside the double quotes
Being a British Railway nerd, I searched for “Bulleid Pacific Clan Line” which returned 102 results specific to that particular locomotive which was built in the 1940s for British Railways Southern Region.
Camera:£350 This one does not give exactly the result you might expect. It does return one result for cameras under £350, but also several results for cameras that feature the number 350 in their names. Under the default ‘All’ search, search using the ‘shopping’ tab does indeed return a large number of results for cameras under £350, including someIPcameras andCCTVones as well.
To be a bit more flexible, try camera £100..£300 for cameras in that price range. Again for best results, choose the ‘shopping’ tab.
You can use AND. OR to combine results. e.g. camera AND film to exclude digital cameras.
To search socialmedia, use the @operatore.g. @Facebook or @twitter
intext:MarsRover Finds web pages that include ‘Mars Rover’ in their text (57.5 million results when I tried this one).
Site:autotrader.co.uk Limits the search to the specified site
for more, there’s the link above, but there are many more, just search for ‘Google Dorks’ and be amazed!
While not Dorks per se, Google have a number of other usefulresources, including images and maps. One I have found to be very handy isGoogle Correlate
Here’s a simple example, I used the term ‘Diet and Exercise’
CheckUserNames
This is potentially a very useful site, the aim is to see whether a givenusernameis available on any of more than 160social networkingsites. The homepage isCheck User Names, If you enter a username, the site will highlight which sites have the name available. It’s a handy way of seeing whether a given username is registered on social media. The site also has a link to their new siteknowem.com, which claims to check more than 500 social media sites I tried some of my social media usernames, and the site (CheckUserNames) did correctly identify the sites where the names have been registered.
The FOCA
FingerprintingOrganizationswith Collected Archives to expand on the (possibly devised to fit the tool’s functionality) acronym.
The main purpose of this tool is to findmetadataand other hidden data contained in a variety of documents and some image files and to show relationships, for example, that some documents may have been produced by the same person or team.
FOCAis aWindowsapplication that requires an instance ofSQLServerto be available for it to store its search results, which may produce a large volume of data.
The FOCA is open source, released under theGNUPublic License (GPL), and is available fordownloadfromthis GitHub page.
Eleven Paths, who are the authors, also have aFOCA market pagewhere you can download several plugins to expand on FOCA’s functionality.
While thinking about what useful information can be found in metadata, another very useful tool is
Metagoofil,
written by Christian Martorella ofEdgeSecurity, here’sMetagoofil’s pageIt’s written forLinuxand can be downloaded as a .tar.gz file from Google Code Archivehere. It’s not been updated since Feb 10 2013, but that may not matter, provided thefunctionalityis correct. Metagoofil ins included in Buscador below, where it can be found in Software → Domains, which opens a small box titled ‘Domains: Choose tool, which containsradiobuttons to choose a tool:
Clicking OK opens another small message with an input field for aURLto search. enter a URL, and
Metagoofil will go off and do its thing. Returning its findings in the form of anhtmlpage in its own folder.
Is anautomatedOSINT tool that gathers freely available information from more than 100 public sources, the type of information thatSpiderfootcan gather includes IP addresses, Email addresses, names, and quite a lot more. It can present the information that it finds in a variery of graphical formats
Spiderfoot is available for Linux,BSD, Solaris and Windows. The Windows version is a freestandingexecutable(.exe file) that appears to be portable as the website says that it comes pre-packaged will all dependencies.
If you don’t want to compile Spiderfoot for Linux, it is also available packaged for Docker.
If you useKali Linux, you’ll know that Spiderfoot is not included in the default Kalidistribution, but it can be installedHere's a link to a tutorial blog post. Note that some browsers may flag this as a site that has been reported as containing harmful software. That’s because it contains links to download Spiderfoot.
This is aLinux distroloaded with OSINT tools, available as aVirtual Machine(VM)image for bothVirtualBoxand VMWare. It is developed and maintained by David Westcott and Michael Bazzell, and hosed onGoogle DriveThe link for the download, and a list of included OSINTapplicationsare available fromIntelTechniqueson theirBuscador page, which also has a helpful list of installation notes, and some helpful notes on using the image in avirtualmachine.
This may be the subject of a future post (or short series of posts) in some detail – watch this space!
Although this is intended primarily as aPenetrationtester’stoolkit, it aims to be the pentester’s ‘SwissArmyKnife),Kalidoes contain several tools of interest to the OSINT investigator, includingMetagoofilandThe Harvester, which is a tool used to gather email accounts and sub-domains from publicly available sources, and a number of other tools. If you've not considered Kali for its OSINT tools, do have a kook, there’s a good amount of video tutorials on YouTube,this link will find them.
There are also a similarly good number of video tutorials covering The Harvester on YouTube,this link will find them
Search Engines
And let’s not overlook the obvious start point for most OSINT (or indeed any otherintelligencegathering) – Search engines, and Google in particular, have become soembeddedin our consciousness that they may not register as the valuable tools that they undoubtedly are. If you doubt that, imagine trying to find a supplier of cheap watches inChinaor Japan (If you’re in Europe or the USA)withoutinternet accessand a search engine.)
Some (like Google,Bingand Yahoo) willtrackyour searches, others, primarilyDuckDuckGomake a point of NOTtrackingyour searches.
Some of the others, Ask for example have changed their underlying technology. Ask, at one time ‘Ask Jeeves) was for a while effectively a re brandedYahoosearch (as is Swagbucks search currently), now however, Ask.com appears to be it’s own thing, and any association with another search engine is not readily apparent. Tracking these changes of ownership might be an interesting OSINT exercise.
Please feel free to do this exercise, and post your results in the comments section under this post.
Some old (And I thought long defunct) search engines are still going likeDog Pile, which used to search several other engines ad return aggregated results.This sitelists several other search engines, and a number of other resources that are useful for OSINT investigations.
Have I Been Pwned
This is a very useful site, just enter anemail addressand the site will tell you whether the email address has beenpwned(found among others in one or more breaches) try it here:Have I Been Pwned.
If you need to check a file for knownviruses, thenVirusTotal is the place to go, here’s theirupload pagewhere you can upload a suspect file for checking. Virus Total also have a number of APIscriptsallowingdevelopersto use the Virus Checking functionality within their applications.
A variety of languages are supported, as are Maltego, in the form of a couple of transforms.
They also have andappforAndroidin the Play Store,desktopapplications for Windows and Mac, but not specifically for Linux, although the Mac application (which uses Qt) can be compiled for you own distribution andbrowser extensionsforChrome,FirefoxandInternet Explorer, but apparently not for Edge yet.
Forums (Fora?)
There’s any number of forums dedicated to most interests (including some that are illegal in most jurisdictions). The best way to find forums, if they’re relevant to your research, is probably to use a search engine (I preferDuckDuckGoas it doesn’t track your activity).
It’s worth using mire than one search, using slightly different terms, as they can sometimes give differing results.
Blogs
there are a number of sites than claim to help you find blogs on any subject, but the one that works for me isThe Blog Search Engine which along with blogs on the chosen subject, will also return YouTube videos and other related sites.
특정 사항이나 raw API를 요약하는 간단한 Helpers functions 컬렉션입니다.
특정 형식에 대한 요구 사항 및 다른 사항으로 인해 직접 사용하는 경우 번거로울 수 있으므로 Bulk API에는 여러 가지 Helpers가 있습니다.
모든 Bulk Helpers는 Elasticsearch 클래스의 인스턴스와 반복 가능한 작업을 수용합니다 (반복 가능하거나 생성자가 될 수 있음). 대량의 데이터 세트를 메모리에 로드하지 않고도 인덱싱 할 수 있습니다.
[2] 사용 예제
import json
import os
from elasticsearch import Elasticsearch
from elasticsearch import helpers
ES_HOST = '127.0.0.1' # set es url or host
ES = Elasticsearch(hosts=ES_HOST, port=9200)
n = 10
##############################
# Bulk api 사용법
# bulk api를 활용해서 불안정한 search api 대신,
# 다량의 데이터를 한번에 전송하기 위해 작성된 예제 코드입니다.
# 코드는 아래와 같은 방식으로 구현되었습니다.
# ==========================================
# 먼저, resultList로 부터 json 포맷으로 저장된 결과들을 list형인 result_dict에 저장합니다.
# 여기서, resultDict에 있는 데이터들이 n개 이상일 경우, 엘라스틱서치에 전송됩니다.
# n개를 넘지않을 경우에는 추가 데이터를 받아오거나,
# 코드가 종료되는 시점에서 나머지 데이터를 전송하는 방향으로 처리하시면 됩니다.
# ==========================================
##############################
def collector():
resultList = [{"target":apple, "taste":"good"}, {"target":banana, "taste": "bad"}]
resultDict = []
for data in resultList:
result = {"_index": ES_index, "_source": data}
resultDict.append(result)
if len(resultDict) > n:
helpers.bulk(ES, resultDict)
resultDict = []
helpers.bulk(ES, resultDict)
An event is a single occurrence within an environment, usually involving an attempted state change. An event usually includes a notion of time, the occurrence, and any details the explicitly pertain to the event or environment that may help explain or understand the event's causes or effects.
위 내용을 빠르게 요약하면 이벤트는 일반적으로 시도된 상태 변화와 관련된 환경에서의 single occurrence (e.g. 단일 발생, 적절한 사례) 이다.
여기서 로그 소스는 두 가지의 일반적인 카테고리로 나눌 수 있다고 한다.
푸시 기반과 풀 기반으로 나뉘는데, 푸시 기반은 장치나 애플리케이션이 로컬 디스크나 네트워크를 통해 메시지를 전송하는 것을 말한다.(e.g. syslog, SNMP, 윈도우 이벤트 로그 - 프로토콜, 전송 메커니즘, 저장, 검색 기능 제공)
그리고 풀 기반은 애플리케이션이 소스에서 로그 메시지를 가져오는 것을 말한다. (클라이언트 - 서버 모델에 의존) 이 방식으로 동작하는 대 다수의 시스템은 로그 데이터를 전용 포맷 형태로 저장한다고 한다.
로그 저장 포맷의 종류는 아래와 같이 크게 3 가지의 종류로 나뉜다.
텍스트 기반 로그 파일
플랫 텍스트 파일 (일반적인 유형: syslog 포맷)
인덱스 플랫 텍스트 파일 (OSSEC라는 로그 보관 유틸리티가 사용하는 전략)
가장 강력한 유틸리티: 아파치 루씬 코어 Apache Lucene Core
루씬은 전체 텍스트를 로깅하고 로그 검색과 분석 유틸리티를 통합하는 인덱스를 생성할 수 있는 자바 기반의 프레임워크를 말한다.
ex. 엘라스틱서치가 루씬 기반이다. 루씬이 제공하는 기능의 대부분 지원하며, 엘라스틱서치를 사용할 시에는 루씬을 사용할때의 불편함을 간소화할 수 있는 장점이 있다. [티몬의 개발이야기 - 인용])
바이너리 파일
일반적인 바이너리 로그 파일 예제: 마이크로소프트 IIS 로그와 윈도우 이벤트 로그
압축 파일
tar, zip 포맷은 오랫동안 사용되어 왔으며, PKZip 포맷 압축 파일은 윈도우 공통 포맷이다.
zgrep과 zcat 도구를 이용하면 grep과 cat으로 압축하지 않은 파일에서 읽는 것처럼 압축 파일에서 데이터를 읽고 검색할 수 있다.
다음으로, 로그를 얻을 수 있는 방법은 무엇이 있을까? 아니면 로그를 얻을 수 있는 자원은 무엇일까? 우리는 아래와 같은 자원에서 우리는 다양한 로그를 얻을 수 있다.
유닉스와 윈도우 시스템
라우터
스위치
방화벽
무선 AP
가상 사설망
안티바이러스 시스템
프린터
그럼 이렇게 다양한 로그를 수집 후, 보관할때 따라야하는 정책들은 무엇이 있을까? 책에서는 다양한 정책들을 보여주었지만, 초반에는 PCI DSS 만을 다루었다. PCI DSS는 Payment Card Industry Data Security Standard 이며, 여기서 다루는 내용은 컴플라이언스 요구사항 평가, 조직의 위험 상태 검토, 다양한 로그 소스의 생성 로그 크기 확인, 가능한 저장장치 선택 검토를 다룬다.
[3] 전송 메커니즘 / 솔루션
대표적으로 로그 전송에서 사용되는 메커니즘 5 가지는 아래와 같다.
syslog UCP/TCP
암호화된 syslog
HTTP 상에서 SDAP
SNMP
FTPS나 SCP와 같은 일반 파일 전송
그리고 로그와 관련된 솔루션은 어떤 것들이 있을까, 이런 솔루션들을 통칭하는 단어는 무엇일까라는 생각이 들었는데, 이 또한 책에 아래와 같이 정리되어 있었다.
elasticsearch의 인덱스에 대한 change 여부를 체크하는 방법에 대한 포스트입니다.
1. 시퀀스 번호를 활용한 추적
시퀀스 번호 체크 방법
http://{ip}:9200/_search?seq_no_primary_term=true
Today this can be solved client side by storing the last sequence number and then polling the shard level stats for the current sequence number; if it is higher, there must have been a change. Closing.
마지막 시퀀스 번호를 저장 한 다음 현재 시퀀스 번호의 샤드 레벨 통계를 폴링하여 클라이언트 측에서 해결할 수 있습니다.
다크웹/딥웹 관련 리서치를 꾸준히 하면서 이와 관련된 데이터 유출 사고에 대해서만 중점적으로 다루는 내용의 포스트가 없었습니다. 해당 포스트에서는 그와 관련된 내용들을 자세히 정리하고, 참고한 자료들에 대한 링크들도 아래에 남겨두었습니다. 또한, 새롭게 일어난 사고에 대해서도 꾸준히 업데이트할 예정입니다.
다크웹 딥웹 관련 데이터 유출 사건 정리
References
620 million accounts stolen from 16 hacked websites now for sale on dark web, seller boasts