{"locale":"en-us","url":"/blog/elastic-stack-7-12-1-released","content":"","title":"Elastic Stack 7.12.1 released","tags":{"product":["elasticsearch","kibana","beats","elastic stack"],"industry":[],"level":[],"use_case":[],"tags":[],"elastic_stack":["blt7bb6b1e9a797738f","bltb5a7ebf330c5002e","blt8b37b4b3ec0fe838","blt3d820a0eae1c9158"],"use_cases":["blt10eb11313dc454f1","blte1906c436045dbef","blt4607298d4fd82c81","bltb249a1eeba77b317","bltcb543cd010a1e2a8"],"topic":[]},"authors":[{"uid":"bltd6be3c9d66f3b266","last_name":"Hall","job_title":"Principal Software Engineer II","company":"Elastic","full_name":"Clint Hall","first_name":"Clint"}],"publish_date":"2021-04-27T18:00:00.000Z","category":["bltfaae4466058cc7d6"]}
{"locale":"en-us","url":"/blog/elastic-and-alibaba-cloud-reflecting-on-our-partnership-and-looking-to-the-future","content":"<p>Alibaba Cloud is an important partner to us here at Elastic. We officially started our collaboration and strategic partnership with Alibaba Cloud <a href=\"/blog/alibaba-cloud-to-offer-elasticsearch-kibana-and-x-pack-in-china\">back in 2017</a>, when we announced the <a href=\"https://data.aliyun.com/product/elasticsearch\">Alibaba Cloud Elasticsearch</a> service. Since then, we’ve seen rapid adoption and growth of the service, which now supports more than 10 petabytes of data. This year, we’ve recognized Alibaba Cloud as our Ecosystem Partner of the Year to acknowledge their contributions in advocating for free and open technology and developing value for our joint customers over the past three years.\n</p><p>One reason for the rapid growth of Alibaba Cloud Elasticsearch is Alibaba Cloud’s proactive push to drive the adoption of Elasticsearch and their approach to educating developers and enterprises on how to optimize their use of Elastic technology.&nbsp;\n</p><p>To assist our joint customers during the pandemic, Alibaba Cloud engineers and Elastic evangelists have launched several free technical courses, targeting developers, students, and engineers to help them accelerate their shift to digital business.\n</p><p>Alibaba Cloud has also helped Elastic to broaden the boundaries of free and open technology. We’re seeing that Alibaba Cloud Elasticsearch users are expanding beyond traditional data retrieval and log analysis use cases into more business-oriented applications. 
And as Alibaba Cloud makes free and open technology more convenient and affordable, Alibaba Cloud Elasticsearch has attracted more conventional business adopters, including traditional retail companies, logistics, finance, and manufacturing.&nbsp;\n</p><p>Underpinning our partnership’s success is how together, Alibaba Cloud and Elastic sit at the intersection of search, open development, and cloud-native technologies.&nbsp;\n</p><p>At Elastic, we focus on building free and open search technology that is easy to use and easy to manage, with open APIs that easily plug into orchestration systems. And, with the shared vision that search is foundational to cloud-native efforts, the Alibaba Cloud team delivers our products to customers as a best-in-class cloud experience with one-click deployment and scaling. When we release new features and capabilities in our products, Alibaba customers immediately have access to them. Together Elastic and Alibaba Cloud take care of the underlying technology, enabling users to focus on their business and quickly and easily scale their infrastructure without having to scale up their teams.&nbsp;\n</p><p>“Thank you Elastic, for your trust and recognition of Alibaba Cloud,” said Yangqing Jia, Vice President of Alibaba Group and Senior Fellow of the Computing Platform BU. “The successful cooperation between our two companies is a paradigm of partnership and co-prosperity between cloud native and open ecosystems. At the same time, it creates a best-in-class blueprint for open source and cloud native product unity. Alibaba Cloud actively embraces ecosystem technology partners such as&nbsp;Elastic to build a larger Alibaba Cloud business ecosystem.”\n</p><p>From the Elastic team to the Alibaba Cloud team: We want to extend our congratulations and sincere thanks for your partnership and collaboration. We’re proud of the strong relationship and the momentum we’ve built together, and there is still plenty for us to accomplish. 
Here’s to the next three years and more!\n</p>","title":"Elastic and Alibaba Cloud: Reflecting on our partnership and looking to the future","tags":{"product":[],"industry":[],"level":[],"use_case":[],"tags":[],"elastic_stack":[],"use_cases":[],"topic":[]},"authors":[{"uid":"blt6796ce51614bf3c4","last_name":"Khushani","job_title":"","company":"","full_name":"Pankaj Khushani","first_name":"Pankaj"}],"publish_date":"2021-04-26T15:00:00.000Z","category":["blt0c9f31df4f2a7a2b"]}
{"locale":"en-us","url":"/blog/searching-logs-free-open-logs-app-kibana","content":"<p>Log exploration and analysis is a key step in troubleshooting performance issues in IT environments — from understanding application slowdowns to investigating misbehaving containers. Did you get an alert that heap usage is spiking on a specific server? A quick search of the logs filtered from that host shows that cache misses started around the same time as the initial spike. Digging into the metadata (date and version) of the highlighted logs from that time period shows us that this was likely due to a recent code push. As seen here — and in many other cases — logs hold the clues to what was happening during, before, or after an issue, which can help identify where to focus our attention next.\n</p><p>The Elastic Stack has&nbsp;<a href=\"https://www.elastic.co/customers/success-stories?usecase=elastic-observability\">long been a favorite for log management and log analytics</a> because it’s built for both speed (which shortens investigations) and scale (which alleviates concerns about dropping data). And, with the <a href=\"https://www.elastic.co/guide/en/ecs/current/ecs-getting-started.html\">Elastic Common Schema</a>, event data is normalized so you can better analyze, visualize, and correlate across all types of events.\n</p><p>There are many ways to explore your logs in Kibana; you can look at them in the Discover app, create custom charts and dashboards, and use tools like <a href=\"https://www.elastic.co/kibana/kibana-lens\">Lens</a> to create visualizations in just a few clicks. In addition, the <a href=\"https://www.elastic.co/log-monitoring\">Elastic Logs app</a> is a powerful tool for searching, filtering, visualizing, and tailing your logs directly in Kibana. Think of it like a souped-up <code>tail -f</code> for logs from your entire environment with powerful search and filtering capabilities. 
The Logs app is part of the free and open distribution of <a href=\"https://www.elastic.co/observability\">Elastic Observability</a> for a frictionless getting started experience without limits on ingest, users, or anything else.\n</p><h2>See a streaming view of your logs (tail your log files)</h2><p>As logs start flowing in from components across your environment, one of the first things you will want to see is the stream of these events as they happen. The Logs app combines the familiar view you’d see in a terminal window with the <a href=\"https://www.elastic.co/kibana\">on-demand analysis capabilities of Kibana</a>. Watch the events stream in live and zoom in on specific log lines to view the details.\n</p><h2>Customize columns with available fields</h2><p>Log events are rich with detail across a range of fields — any one of which might hold the next clue in your investigation. However, that metadata can clutter our window when it’s not relevant to our specific issue. <a href=\"https://www.elastic.co/guide/en/observability/current/configure-data-sources.html#customize-stream-page\">Choose the fields you want to see</a> so the information you need is displayed on screen while the granular details are still at your fingertips. Include or exclude fields like <code>timestamp</code>, <code>message</code>, and <code>host.ip</code>, or slice and dice to make custom columns with <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/master/runtime.html\">runtime fields</a>.\n</p><h2>Search (and filter) across all of your logs</h2><p>Pinpoint the relevant logs with plain text search or auto-completing keywords (and values) via the search bar. For instance, you can search for logs that contain the word “error.” As you start to type, the <a href=\"https://www.elastic.co/guide/en/kibana/current/kuery-query.html\">Kibana Query Language</a> provides suggestions on which fields to search. Use the “Highlights” function to jump to relevant logs based on a keyword. 
With the filter still in place, we can further sift through the events. The Logs app will highlight your requested term. Save time by jumping directly to the log lines that are pertinent to your investigation.\n</p><h2>View your logs in context</h2><p>A data point in isolation is a valuable indicator, but a data point in context tells a story. Now that you’ve pinpointed the concerning logs, take a step back and get a sense for what was happening around that time in the specific application or container. What happened just before and after? Use this context to identify root causes faster and lower MTTR.\n</p><h2>Try the free Logs app for yourself</h2><p>Follow along with this video to see the app in action.\n</p><p><img class=\"vidyard-player-embed\" src=\"https://play.vidyard.com/yp7G1RNRaZ6zQB694pAJTJ.jpg\" data-uuid=\"yp7G1RNRaZ6zQB694pAJTJ\" data-v=\"4\" data-type=\"inline\" style=\"width: 100%; margin: auto; display: block;\">\n</p><p>Start analyzing your logs with a simple <a href=\"https://www.elastic.co/blog/getting-started-with-free-and-open-elastic-observability\">download of Elasticsearch and Kibana following these best practices</a>. 
Play with <a href=\"https://www.elastic.co/guide/en/kibana/current/get-started.html#gs-get-data-into-kibana\">sample data</a>, start shipping your logs with one of the hundreds of out-of-the-box <a href=\"https://www.elastic.co/integrations\">integrations</a>, or ship the logs from your custom applications and services, and use <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/master/runtime.html\">runtime fields</a> to split things up later.\n</p><p>Everything mentioned here is also available on <a href=\"https://www.elastic.co/cloud/\">Elastic Cloud</a> for a fully-managed experience.\n</p>","title":"Searching through logs with the free and open Logs app in Kibana","tags":{"product":["logs"],"industry":[],"level":["introduction"],"use_case":[],"tags":[],"elastic_stack":[],"use_cases":[],"topic":[]},"authors":[{"full_name":"Renuka Gough","uid":"blta2dff1fc0b99cada","last_name":"Gough","first_name":"Renuka"}],"publish_date":"2021-04-28T16:00:00.000Z","category":["blt1d90b8e0edce3ea9"]}
{"locale":"en-us","url":"/blog/elastic-for-education-how-we-support-students-and-educators","content":"<p>Launched in May 2020, the <a href=\"/community/students-and-educators\">Elastic for students and educators</a>&nbsp;program provides the resources and support necessary to bring our products and solutions into classrooms around the world. Whether it’s through provisioning <a href=\"/cloud/\">Elastic Cloud</a> accounts for students to use during their course, extending access to premium features like&nbsp;machine learning to propel research, or working with universities to design partnership programs to help prepare students for their future careers, Elastic is committed to education and inspiring the future workforce.&nbsp;\n</p><h2>An educator uses Elastic Cloud to power data visualization</h2><p><a href=\"https://www.iit.comillas.edu/people/lfsanchez\">Luis Francisco Sanchez Merchante</a>, Assistant Professor in the Department of Telematics and Computer Science at Comillas University in Madrid, Spain, is part of the Elastic Cloud for education program that extends free cloud instances to students and educators for academic use. Luis explained how this program helps power improved data visualization in his course.&nbsp;\n</p><p>In his own words:&nbsp;\n</p><blockquote>We introduced Elastic products with the purpose of improving our data visualization programs. We intended to provide our students with a differential industry-oriented education. Most visualization courses in the market focus on multi-purpose applications implementing use cases based on data files. Multi-purpose visualization tools are amazing. We indeed train our students on their use. Introducing new sessions with tools like Kibana, we provide our students with a wider knowledge that will allow them to have better criteria to decide the best alternatives depending on the use case. 
I have complete faith that the use of actual data pipelines with ingestion tools like Filebeat or Logstash and big data databases like Elasticsearch prepares our students for real industrial environments.<br><br>\nIn addition to these benefits, using an Elastic Cloud deployment has been and still is of great help. Recent limitations due to mobility restrictions have proved that cloud deployments are the best option for academia. They have proved to be a great alternative to the university's on-premises resources when a significant proportion of students attended our sessions from home.<br>\nEvery six months we anonymously survey our students and we get confirmation that our students understand the usefulness of data visualization. That’s why this program today registers around 250 students when only two years ago there were only 50 of them.\n</blockquote><p><em><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt8a65b02ae165f3ca/606f3c3e4662f62195199121/Imagen1.png\" data-sys-asset-uid=\"blt8a65b02ae165f3ca\" alt=\"Kibana dashboards created by students in Luis’ course.\"><br></em>\n</p><figcaption>Kibana dashboards created by students in Luis’ course.</figcaption><h2>Elastic machine learning powers a student’s final year project</h2><p>Thomas Bisof, a BSc. Ethical Hacking and Cybersecurity student at Coventry University, was granted a free license extension to power the research being used in his final year project.\n</p><p>In his own words:&nbsp;\n</p><blockquote>I'm using the Elastic Stack, primarily Elasticsearch, Kibana, and various Beats. 
These tools are being used as part of my final year project: \"Utilising the Elastic Stack and its anomaly detection capabilities to identify known attack vectors.\" In the absence of a research grant, Elastic supported the project with an extended trial.<br><br>\n\tI first heard about Elastic prior to starting University (2017), when I was running my Minecraft server. It was introduced to me by researching within the area and word-of-mouth within the community.<br><br>\n\tElastic is very simple and easy to use. Whenever I faced issues with access control once I had enabled the community helped to quickly resolve them. Furthermore, the Kibana front end is beautiful and a delight to use.<br><br>\n\tI believe that having experience with Elastic makes me more employable. I feel confident knowing that I have used an industry standard tool.<br><br>\n\tThe community and staff members have been great.\n</blockquote><h2>University students receive hands-on training through a partnership with Elastic</h2><p>Led by Jongmin Kim, Elastic Senior Developer Advocate, Elastic’s partnership with SangMyung University (SMU) in South Korea launched in late 2020. The partnership provides students and educators with an opportunity to engage with Elastic products and solutions through in-person training, project mentoring, and the creation of a dedicated resource library to support student skills development. Students who participate in the program are also given the opportunity to apply for internships with Elastic partner companies in the region that are looking for new talent to help support their business.&nbsp;\n</p><p>While industry and university partnerships are nothing new to the region, they are usually with larger global corporations that have a dedicated program to support the initiative. 
However, SMU saw value in Elastic and wanted to work with a smaller company due to the high demand for graduates with skills in Elastic.&nbsp;\n</p><p>Some of the events hosted to date as part of the partnership include:&nbsp;\n</p><ul>\n\t<li aria-level=\"1\">3-part workshop series introducing students to Elasticsearch and Kibana</li>\n\t<li aria-level=\"1\">Mentoring support for students’ course project which utilized Kibana</li>\n\t<li aria-level=\"1\">Elastic career seminar</li>\n\t<li aria-level=\"1\">4-day consulting workshop — over 100 students registered, but only 21 could attend due to Covid-19 restrictions\n\t<ul>\n\t\t<li aria-level=\"2\">Partners participated in workshop to help identify students for internships</li>\n\t</ul></li>\n\t<li aria-level=\"1\">3 students presented at our recent Elastic Community Conference about their use of Elastic in their course</li>\n</ul><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt16170b1b8fc762c8/606f3cb44662f62195199137/blog_image_2.png\" data-sys-asset-uid=\"blt16170b1b8fc762c8\" alt=\"blog_image_2.png\"><br>\n</p><figcaption>Students from SMU participating in a 4-day workshop series.</figcaption><p>This may be the first partnership of its kind in the region, but Elastic sees this as a model for other universities interested in bringing Elastic into their classrooms.&nbsp;\n</p><h2>Interested in working with Elastic?</h2><p>These are just a few of the ways in which Elastic impacts education by engaging students and educators with our products and solutions. We are always looking for new opportunities to work with universities around the world and help to educate and inspire students interested in learning more about Elastic. 
If you are interested in working together, please <a href=\"mailto:students_highered@elastic.co\">reach out</a> to start the conversation!\n</p>","title":"Elastic for Education: How we support students and educators","tags":{"product":["elastic cloud","kibana","machine learning"],"industry":[],"level":[],"use_case":[],"tags":[],"elastic_stack":[],"use_cases":[],"topic":[]},"authors":[{"uid":"bltf0d611e5d0023d98","last_name":"Nissen","job_title":"","company":"","full_name":"Stephanie Nissen","first_name":"Stephanie"}],"publish_date":"2021-04-28T15:00:00.000Z","category":["blt26ff0a1ade01f60d"]}
{"locale":"en-us","url":"/blog/culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud","content":"<p>What does a career journey into the inside sales team look like at Elastic? We talked to two of our own — Tom Wilde and Jan Kumorowicz — about moving from user success manager roles into cloud sales, and what it’s like to be at the start of a customer’s journey.</p><h3>Tom Wilde, inside cloud account executive</h3><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/bltce03674aa872d454/608a8f7db9440f10206ea675/Tom_Wilde.jpg\" data-sys-asset-uid=\"bltce03674aa872d454\" alt=\"Tom Wilde\" style=\"display: block; margin: auto;\"></p><p>After about two years as a user success manager, I began to master the art of outbound sales — cold calling potential enterprise customers, working to find new leads via LinkedIn, and having proof-of-concept conversations. But I was eager for the chance to learn how to make sales from inbound leads. My manager encouraged me to apply for an opening on the inside cloud account team and a more senior role. <a href=\"https://www.elastic.co/cloud/?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">Cloud offers so many products and services for our customers</a>, including Enterprise Search, observability, and security. I was excited to learn more, and deliver greater impact for our customers, in what I believe is the future of our business. From a career standpoint there’s no better place to be.</p><p>My day-to-day tasks in the role include helping create opportunities for customers spinning up their first <a href=\"https://cloud.elastic.co/registration?elektra=en-cloud-page?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">Elastic Cloud trial</a>. 
We’re here to try and break down some of the barriers that might prevent them from being a paid customer, and get them to that place within two months of their first contact with us. To do that, I have a chat with them, get to know what they’re hoping to accomplish with <a href=\"http://elastic.co/products/?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">our products</a>, and get them the resources they need to succeed and continue working with us. I’m also there if these new customers run into technical problems and need assistance.</p><p>We have an opportunity to convert a wide range of customers. As an enterprise user success manager, I was working with big clients. Now, I get to create solutions for all sorts of use cases — even small businesses that want to use Elastic Cloud for their website search. We get these customers started on a long-term journey with Elastic, which has a huge impact as they continue on and renew. I think that’s the most exciting part of this job — the opportunity to have these early conversations. Since the formation of our team we’ve had quite a few competitive wins, and I think it’s because of our work as a first touchpoint in the customer journey.</p><p>When I became an inside cloud account executive, I got a lot of training and support to tackle new responsibilities. One thing that’s new to this role is a need for technical savvy, which includes becoming certified in cloud technology. For me, that meant completing an AWS fundamentals course right away. Learning these skills allows us to help our customers leverage all the benefits of the cloud platform they’re deploying from. To be successful in this role you’ve got to be interested in learning new things. You have to understand how people are interacting with data, and know how our products can solve their problems with the support of other teams. 
Every day is a school day.</p><h3>Jan Kumorowicz, inside cloud account executive</h3><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt015368dfde5e7885/608a8f933aa0431020f5de88/Jan_Kumorowicz.jpg\" data-sys-asset-uid=\"blt015368dfde5e7885\" alt=\"Jan Kumorowicz\" style=\"display: block; margin: auto;\"></p><p>I started as a user success manager right out of college. Actually, I skipped my graduation so that I could start a week early — that’s how excited I was to join Elastic. I moved into the inside cloud account executive role about a year ago. At Elastic, you’re given the freedom to build your career as you envision it. What’s great is that as you’re building that vision, everyone at Elastic is generous with their time. They’ll always give you an opportunity to talk and help you see the vision for a team, which makes it easier to find your way.</p><p>I chose this role because I was really interested in the onboarding process. The inside sales role offers a lot of options for people wanting to go into sales, without being pigeonholed. If you want to pursue a career that’s focused on the inside sales motion or high velocity sales, if you like building relationships with cloud vendors, or if you’re interested in moving into management early in your career, the onboarding route gives you a firm grounding in all those things.</p><p>I think the most exciting part of this role is speaking to potential customers. We’re here to make sure they’re successful with their <a href=\"https://cloud.elastic.co/registration?elektra=en-cloud-page?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">14-day trials</a>. I personally love working with small companies, because often they’re betting their whole company’s future on using our products. 
That’s a really humbling experience, and I think the best part of my day-to-day tasks — understanding how we can provide the most relevant materials to our customers to make sure they’re successful.</p><p>I think this role really embodies a startup within a startup mentality. We’re thinking about how to improve the process for potential customers, remove roadblocks, and make the process as smooth as possible. That might mean setting up an AMA, or an onboarding session at a meetup. It takes creativity. You’re at the center of a pivotal moment in the customer’s journey — it’s exciting to collaborate with other departments and see where that can go.</p><p><em>Interested in joining Elastic? We’re hiring. Check out <a href=\"https://www.elastic.co/about/teams/?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">our teams</a> and <a href=\"https://www.elastic.co/about/careers/?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">find the right career for you!</a> Want to read more about life at Elastic? Read more on <a href=\"https://www.elastic.co/blog/category/culture?baymax=web&elektra=culture-career-stories-tom-wilde-and-jan-kumorowicz-on-growing-their-sales-careers-with-cloud\">our blog</a>!</em></p>","title":"Elastic Career Stories | Tom Wilde and Jan Kumorowicz on growing their sales careers with cloud","tags":{"product":[],"industry":[],"level":[],"use_case":[],"tags":[],"elastic_stack":[],"use_cases":[],"topic":[]},"authors":[{"full_name":"Elastic Culture","uid":"blt7fc3768df8cad1f6","last_name":"Culture","first_name":"Elastic"}],"publish_date":"2021-04-29T15:00:00.000Z","category":["bltc253e0851420b088"]}
{"locale":"en-us","url":"/blog/the-essentials-of-central-log-collection-with-wef-wec","content":"<p>Last week we covered <a href=\"https://www.elastic.co/blog/the-essentials-of-windows-event-logging\">the essentials of event logging</a>: Ensuring that all your systems are writing logs about the important events or activities occurring on them. This week we will cover the essentials of centrally collecting these Event Logs on a Windows Event Collector (WEC) server, which then forwards all logs to Elastic Security. </p> <strong><h2><strong>WEF and WEC</strong></h2></strong> <p>Modern versions of Windows include the <a href=\"https://docs.microsoft.com/en-gb/windows/win32/winrm/portal\">Windows Remote Management (WinRM)</a> services that implement the <a href=\"https://docs.microsoft.com/en-gb/windows/win32/winrm/ws-management-protocol\">WS-Management (WSman) protocol</a>, and just to add to the acronym spaghetti soup, this is all part of <a href=\"https://docs.microsoft.com/en-us/windows/win32/wmisdk/wmi-start-page\">Windows Management Instrumentation (WMI)</a>. One component of WinRM is the Windows Event Forwarding (WEF) service, which is why WinRM and co. need to be enabled. WEF can forward Windows Event Logs to a Windows Server running the Windows Event Collector (WEC) service. </p> <p>There are two modes of forwarding: </p> <ol> <li aria-level=\"1\">Source Initiated: The WEF service connects to the WEC server</li> <li aria-level=\"1\">Collector Initiated: The WEC service connects to the WEF service</li> </ol> <p>Both use WSman to forward the logs and require WinRM to be running. </p><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/bltea2bc54141b72a27/60833606ea92433fc51395a4/1-wsman-log-forwarding-blog-essentials-window-event-logging.png\" data-sys-asset-uid=\"bltea2bc54141b72a27\" alt=\"1-wsman-log-forwarding-blog-essentials-window-event-logging.png\"></p>  <p>There are a number of pitfalls and hurdles when setting up WEF and WEC. 
Following our <a href=\"https://ela.st/tjs-wec-cookbook\">WEC Cookbook</a>, you can avoid these. However, for a higher-level view with richer context, we will discuss them here, as well as the solution taken in the Cookbook. </p> <strong><h2><strong>‘Forwarded Events’ event log file</strong></h2></strong> <p>In the Windows Event Log system there are Channels. These Channels are ultimately backed by an event log file that stores all the event logs written to that Channel. A Windows system comes with a set of predefined Channels and applications can add their own Channels by registering new “Providers.” </p> <p>This means that out of the box, a WEC server only has the Channels that a normal Windows server has anyway for its own logs. Then where should one store all the logs that are being forwarded to the WEC server? There are three options; let's look at them: </p> <p><strong>1. </strong>Store in the local Channel matching the remote Channel (i.e., the remote “Security” Channel events are stored in the WEC’s local “Security” Channel).&nbsp; </p> <p>Pitfalls: </p> <ul> <li aria-level=\"1\">All your remote logs are mixed with your local logs&nbsp;</li> <li aria-level=\"1\">The WEC server may loop its own event logs to this Channel&nbsp;</li> <li aria-level=\"1\">Log management and access control are made very difficult</li> </ul> <p><strong>2.</strong> Store all the remote logs in the local “Forwarded Events” Channel. 
</p> <p>Pitfalls: </p> <ul> <li aria-level=\"1\">Poor write performance, since all writes are to a single file</li> <li aria-level=\"1\">Poor search/read performance, as events are not partitioned in separate files</li> <li aria-level=\"1\">Poor data life cycle management, as this is per log file, therefore all forwarded events are treated as equal</li> <li aria-level=\"1\">Poor resource utilisation of the WEC server, because all work is bottle-necked to a single file</li> <li aria-level=\"1\">Poor access management, as separate files would allow differentiated file access controls</li> <li aria-level=\"1\">Poor coverage/visibility — due to the issues above, many heavily restrict what event logs are forwarded, leaving gaps in their visibility</li> </ul> <p><strong>3.</strong> Create new Channels for the WEC server.&nbsp; </p> <p>This is not as obvious as it might seem, and most would be forgiven for not knowing that it was an option.&nbsp; </p> <p>Many WEC servers have been set up with options 1 or 2 (above), until Microsoft's own internal Security team wrote a blog post (about 15 years ago) on how they used the Windows SDK to implement option 3. Here is a similar <a href=\"https://docs.microsoft.com/en-gb/archive/blogs/russellt/creating-custom-windows-event-forwarding-logs\">revision posted in 2016</a>. </p> <strong><h2><strong>New WEC event Channels</strong></h2></strong> <p>Armed with the ability to create arbitrary event Channels, what should we create? How should we organise and architect our WEC server? There are many schools of thought — the <a href=\"https://ela.st/tjs-wec-cookbook\">WEC Cookbook</a> groups enterprise assets together so that you can manage your logs’ access control and data lifecycle accordingly. </p> <p>Before we delve deeper, let’s look at another approach. Some of you might have come across <a href=\"https://github.com/palantir/windows-event-forwarding\">Palantir’s WEC architecture and guideline</a>. 
Here, they <a href=\"https://github.com/palantir/windows-event-forwarding/tree/master/windows-event-channels\">created a Channel</a> per event log type: Powershell, WMI, DNS, Firewall, etc. These Channels contain logs from all asset types (Domain Controllers, Domain Server, Domain Workstation) and Departments/Biz-Units/OU. They’re also not organised into a hierarchy, so they’d just be a long list in Event Viewer. The Channels therefore also have their <a href=\"https://github.com/palantir/windows-event-forwarding/tree/master/wef-subscriptions\">WEC subscriptions</a>. Palantir also has a recommended <a href=\"https://github.com/palantir/windows-event-forwarding/tree/master/group-policy-objects\">audit policy</a>.&nbsp; </p> <p>I like to point out other approaches, such as this one from Palantir, as one size does not fit all, and their approach might suit your organisation better than the one set out in our Cookbook. </p> <p>If you have looked at Palantir’s Channel list, you will have noticed their “WEC#-Something” format, where the number ‘#’ increases every seven Channels. This is because in the Windows Event Log system, Channels are defined by what is known as a “Provider,” and a Provider can only define up to eight Channels:</p><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/bltdcf5ed3da7b4f2f0/608335e592f0063e5c0723de/2-channels-9-blog-essentials-window-event-logging.png.png\" data-sys-asset-uid=\"bltdcf5ed3da7b4f2f0\" alt=\"2-channels-9-blog-essentials-window-event-logging.png.png\"><br></p>  <p>However, an off-by-one bug in the “ecmangen” tool that all of us non-Windows-SDK-developer security people used made it frustrating to have more than seven Channels per Provider in it. 
</p><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt35e47ef407f9fcf6/608335f2b35a7a3c69a385d4/3-channels-8-blog-essentials-window-event-logging.png.png\" data-sys-asset-uid=\"blt35e47ef407f9fcf6\" alt=\"3-channels-8-blog-essentials-window-event-logging.png.png\"></p>  <p>Instead of fixing the many bugs in it, it appears that Microsoft dropped ecmangen from the Windows SDK, meaning you either use an older SDK or create the Manifest XML file yourself, perhaps with your favourite XML editor. </p> <p>Like everyone else, I used ecmangen (originally) and stuck to seven Channels to make life easier. The current <a href=\"https://ela.st/tjs-wec-cookbook\">Cookbook</a> is based on PowerShell scripts that generate the XML. Ecmangen is no longer needed, so you are free to have eight Channels if you want, although the Cookbook still only uses seven recommended Channels. </p> <p>Note: Think of a Provider as a box of eight Channels, where each Channel is ultimately a separate log file. </p> <strong><h2><strong>Organise by asset</strong></h2></strong> <p>Taking advantage of the fact that you can have as many Providers as you want, we can use this to organise by asset. In your AD environment, you have probably already grouped domain members by asset type (e.g., Domain servers, Domain Controllers, workstations, etc.) and/or by the department (dare I say organisational unit) that those assets are in. </p> <p>So in <a href=\"https://ela.st/tjs-wec-cookbook\">the Cookbook</a> you can easily create Providers to match your AD’s organisation, along the lines of Providers per department (Biz Unit or OU) and/or per Asset type and/or per Asset criticality (Lab/Test/Production). </p> <p>When you separate by asset type, you get the added benefit of being able to better manage access control and log life cycle. If you need to look at the logs of a specific asset type, you know where they are. 
</p> <p>To keep things simple, all Providers then get the same set of (up to eight) Channels. Then systems are mapped to a Provider (via OU) and the event logs on that system map to Channels in that Provider. </p> <p>Out of the box, the “wec_config.ps1” script that is used to configure the WEC server to match your AD architecture has some Providers and asset assignment defined: </p> <ul> <li aria-level=\"1\">Domain Controllers: I think member assignment is clear here</li> <li aria-level=\"1\">Domain Servers: The servers in your domain</li> <li aria-level=\"1\">Domain Clients: User workstations (desktop/laptops)</li> <li aria-level=\"1\">Domain Privileged: More privileged systems (e.g., Jumphosts or WEC Server)</li> <li aria-level=\"1\">Domain Members: Catch-all for normal domain members; not in the other groups</li> <li aria-level=\"1\">Domain Misc: Miscellaneous, for those hosts that don't fit</li> </ul> <p>You are encouraged to refine and edit the list to match your AD environment. </p> <p>Then the out-of-the-box Channel list is: </p> <ul> <li aria-level=\"1\">Application: “Application” and similar logs</li> <li aria-level=\"1\">Security: “Security” and similar logs</li> <li aria-level=\"1\">Sysmon: “Microsoft-Windows-Sysmon/Operational”</li> <li aria-level=\"1\">System: “System”, “HardwareEvents”, DNS-Client, DHCP-Client, “Setup” and similar logs</li> <li aria-level=\"1\">Script: “Windows PowerShell” and similar logs</li> <li aria-level=\"1\">Service: DNS-Server, DHCP-Server, and other service logs</li> <li aria-level=\"1\">Misc: Any other miscellaneous logs</li> </ul> <p>Again, you are free to change this to suit your needs. 
</p> <strong><h2><strong>WEC subscriptions</strong></h2></strong> <p>A WEC subscription defines the following: </p> <ul> <li aria-level=\"1\">An event log (XPath) filter, selecting what events should be forwarded</li> <li aria-level=\"1\">A destination Channel, stating where to store the received events on the WEC server</li> <li aria-level=\"1\">Type:<ul> <li aria-level=\"2\">Collector Initiated, the WEC connects to the WEF service<ul> <li aria-level=\"3\">Target computers, a list of computers to connect to</li> </ul></li>  <li aria-level=\"2\">Source Initiated, the WEF connects to the WEC server<ul> <li aria-level=\"3\">Computer groups, the AD groups whose (computer) members may access this subscription</li> </ul></li>  </ul></li>  <li aria-level=\"1\">Event delivery options to control bandwidth/latency and/or HTTP/HTTPS</li> <li aria-level=\"1\">Format type: RenderedText or just the Event XML</li> </ul> <p>The cookbook scripts (notably setup_subscriptions.ps1) configure the Event XML format type. These are much smaller, so it means more throughput, more logs stored, less load, and less bandwidth. However, the drawback is that if the source Provider (on the remote system) is not registered locally on the WEC server’s Event Log System, then the Event Viewer won't be able to display a text description of the message in your local language. However, sending Rendered Text events is so resource intensive that it’s difficult to justify. </p> <p>I mentioned at the start that WEF is a function of WinRM. Well, this WinRM component runs as the local system’s “Network Service” user. This means that WEF can’t actually read most of your system logs, and your WEC server will receive a very nondescript Event ID 111 message and no other logs. For this reason, the Cookbook guides you through creating a GPO to add “Network Service” to the local “Event Log Readers” group. </p> <p>How does WinRM get the configuration for WEF? 
In the same GPO as mentioned above, we also publish a WSman URL that lists all the WEC subscriptions on that server. In fact, we can list multiple WSman subscription URLs from multiple WEC servers, and the WEF service will try to get and execute them all, thus allowing for redundant WEC servers. </p> <p>All the subscriptions? I don’t want my Workstation sending event logs to my Domain Controller log files! The WSman entries that represent a subscription have AD group permissions applied to them as set up in the Subscription configuration. This means if the computer that WEF is running on is not a member of an AD group that has permission to read the subscription, it can’t get and execute the subscription. It also means that if you’re not careful and a computer is a member of more than one WEC subscription AD group, you will get multiples of the same event log from that WEF host on your WEC! </p> <p>Computer Groups? But I want to map computers based on the OU that they are placed in! Unfortunately, that is not how WEF/WEC/WinRM/WSman work. However, the Cookbook provides a mechanism to keep a given group’s membership in sync with specified OU locations. Thus you can pretend everything is done via OU! </p> <strong><h2><strong>Bringing it all together</strong></h2></strong> <p>There are a lot of moving parts, and a lot of complexity, to get right when setting up WEF & WEC for good observability or security use cases. </p> <p>Fear not, however, for our <a href=\"https://ela.st/tjs-wec-cookbook\">Cookbook</a> is here, accompanied by a set of <a href=\"https://github.com/ElasticSA/wec_pepped\">PowerShell scripts</a> to automate most of the steps. This means there is less room for mistakes, actions are reproducible, and thus mistakes are more easily fixable. </p> <p>It all starts with the wec_config.ps1 script, which you are expected to edit to your heart's content. All the following scripts will take their lead from that. 
For example, you can change the event log filter that selects which event logs get forwarded in wec_config.ps1, and then re-run setup_subscriptions.ps1 to apply the change. </p> <p>Let's take a look at what the scripts do (the Cookbook goes into far more detail on how to use them): </p> <ul> <li aria-level=\"1\">wec_config.ps1 - The configuration of your WEC server, sourced by the other scripts</li> <li aria-level=\"1\">gen_manifest.ps1 - This will create the Manifest XML that describes all your Providers and their Channels for the Windows SDK (no need to use ecmangen anymore!)</li> <li aria-level=\"1\">build_man2dll.ps1 - Taking your manifest, this will build the Windows Event Subsystem Module DLL that implements all your new Providers and Channels on any system you install it on (usually the WEC server)</li> <li aria-level=\"1\">install_channels.ps1 - Takes the DLL and Manifest and installs them on the local system</li> <li aria-level=\"1\">configure_channels.ps1 - Will apply the Log Path and Log Size configuration (from wec_config.ps1) to all your newly installed Channels</li> <li aria-level=\"1\">setup_subscriptions.ps1 - Will set up (create or reconfigure) all the subscriptions for your Provider/Channels on the WEC server</li> <li aria-level=\"1\">map_ou2group.ps1 - You probably want to use your AD’s OUs, but WEC Subscriptions select computers via AD Groups. 
This script will sync the membership of given groups to the computers under specified OUs, again using the configuration in wec_config.ps1</li> <li aria-level=\"1\">gen_winlogbeat_config.ps1 - The config that ships with Winlogbeat won’t know about all your extra WEC subscription Channels, so this will update that configuration for you</li> <li aria-level=\"1\">beat_cmd.ps1 - A helper script for interacting with Beat commands on PowerShell</li> </ul> <p>Unfortunately, all the configuration on the AD side, such as Group Policies, still has to be done manually, but the <a href=\"https://ela.st/tjs-wec-cookbook\">Cookbook</a> has step-by-step guides with screenshots. I may write scripts for that too one day; stay tuned. </p> <p>Finally, Winlogbeat needs to be configured to send all the WEC logs to Elastic Security. The Cookbook will guide you through this too. </p> <strong><h2><strong>Conclusion</strong></h2></strong> <p>I hope that after reading this blog post, and possibly <a href=\"https://ela.st/tjs-wec-cookbook\">the Cookbook itself</a>, you have a good idea of the decisions you need to make before you start, and that you now have all the guidance and tools you need to create that perfect WEC server for your enterprise. </p> <p>Now that you have the proper audit policies in place, WEF configured, and a WEC server set up to forward your AD domain’s event logs to Elastic Security, in our following blog post we will look at what you can do with this extremely important and useful log data in Elastic Security. </p> <p>If you’re new to <a href=\"https://www.elastic.co/security\">Elastic Security</a>, you can experience our latest version on <a href=\"https://www.elastic.co/elasticsearch/service\">Elasticsearch Service</a> on Elastic Cloud. Also be sure to take advantage of our <a href=\"https://www.elastic.co/training/elastic-security-quick-start\">Quick Start training</a> to set yourself up for success. 
</p> <p>See other Cookbook guides that I have written: <a href=\"https://ela.st/tjs-cookbook-lib\">https://ela.st/tjs-cookbook-lib</a>&nbsp; </p>","title":"The essentials of central log collection with WEF and WEC","tags":{"product":["security"],"industry":[],"level":[],"use_case":["security analytics"],"tags":[],"elastic_stack":[],"use_cases":["blt2e5ece40473e6b0a"],"topic":[]},"authors":[{"uid":"blta613d0a3822ba89e","last_name":"Jändling","job_title":"Sr. SA / Global Sec. Specialist Group","company":"Elastic","full_name":"Thorben Jändling","first_name":"Thorben"}],"publish_date":"2021-04-29T18:00:00.000Z","category":["blt1d90b8e0edce3ea9"]}
{"locale":"en-us","url":"/blog/managing-and-troubleshooting-elasticsearch-memory","content":"<p>Hiya! With Elastic’s expansion of our Elasticsearch Service Cloud offering and automated onboarding, we’ve expanded the Elastic Stack audience from full ops teams to data engineers, security teams, and consultants. As an Elastic support rep, I’ve enjoyed interacting with more user backgrounds and with even wider use cases.&nbsp;\n</p><p>With a wider audience, I’m seeing more questions about managing resource allocation, in particular the mystical shard-heap ratio and avoiding circuit breakers. I get it! When I started with the Elastic Stack, I had the same questions. It was my first intro to managing Java heap and time series database shards, and scaling my own infrastructure.\n</p><p>When I joined the Elastic team, I loved that on top of documentation, we had blogs and tutorials so I could onboard quickly. But then I struggled my first month to correlate my theoretical knowledge to the errors users would send through my ticket queue. Eventually, I figured out, like other support reps, that a lot of the reported errors were just symptoms of allocation issues and the same seven-ish links would bring users up to speed to successfully manage their resource allocation.\n</p><p>Speaking as a support rep, in the following sections, I’m going to go over the top allocation management theory links we send users, the top symptoms we see, and where we direct users to update their configurations to resolve their resource allocation issues.\n</p><h2>Theory</h2><p>As a Java application, Elasticsearch requires some logical memory (heap) allocation from the system’s physical memory. This should be up to half of the physical RAM, <a href=\"https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html#compressed_oops\">capping at 32GB</a>. 
Setting higher heap usage is usually <a href=\"https://www.elastic.co/guide/en/cloud/current/ec-memory-pressure.html#ec-memory-pressure-causes\">in response</a> to expensive queries and larger data storage. <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/circuit-breaker.html#parent-circuit-breaker\">Parent circuit breaker</a> defaults to 95%, but we recommend scaling resources once consistently <a href=\"https://www.elastic.co/blog/found-understanding-memory-pressure-indicator#conclusion\">hitting 85%</a>.&nbsp;\n</p><p>I highly recommend these overview articles by our team for more info:\n</p><ul>\n\t<li aria-level=\"1\"><a href=\"https://www.elastic.co/blog/a-heap-of-trouble\">A heap of trouble</a></li>\n\t<li aria-level=\"1\"><a href=\"https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html\">Heap: Sizing and swapping</a></li>\n\t<li aria-level=\"1\"><a href=\"https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster\">How many shards should I have in my Elasticsearch cluster?</a></li>\n</ul><h2>Config</h2><p>Out of the box, Elasticsearch’s <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/7.11/important-settings.html#heap-size-settings\">default settings</a> automatically size your JVM heap based on node role and total memory. However, as needed, you can configure it directly in the following three ways:\n</p><p>1. 
Directly in your <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/7.11/advanced-configuration.html#setting-jvm-heap-size\">config &gt; jvm.options</a> file of your local Elasticsearch files\n</p><pre class=\"prettyprint\">## JVM configuration \n################################################################ \n## IMPORTANT: JVM heap size \n################################################################ \n… \n# Xms represents the initial size of total heap space \n# Xmx represents the maximum size of total heap space \n-Xms4g\n-Xmx4g\n</pre><p>2. As an Elasticsearch environment variable <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/7.12/docker.html#docker-cli-run-prod-mode\">in your docker-compose</a>\n</p><pre class=\"prettyprint\">version: '2.2'\nservices:\n  es01:\n    image: docker.elastic.co/elasticsearch/elasticsearch:7.12.0\n    environment:\n      - node.name=es01\n      - cluster.name=es\n      - bootstrap.memory_lock=true\n      - \"ES_JAVA_OPTS=-Xms512m -Xmx512m\"\n      - discovery.type=single-node\n    ulimits:\n      memlock:\n        soft: -1\n        hard: -1\n    ports:\n      - 9200:9200\n</pre><p>3. Via our Elasticsearch Service &gt; Deployment &gt; <a href=\"https://www.elastic.co/guide/en/cloud/current/ec-customize-deployment-components.html#ec-cluster-size\">Edit view</a>. 
Note: The slider assigns physical memory and roughly half will be allotted to the heap.\n</p><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt85a09eec61be50dc/60832291b775555b377675af/blog-elasticsearch-memory.png\" data-sys-asset-uid=\"blt85a09eec61be50dc\" alt=\"blog-elasticsearch-memory.png\" style=\"display: block; margin: auto;\">\n</p><h2>Troubleshooting</h2><p>If you’re currently experiencing performance issues with your cluster, it’ll most likely come down to the usual suspects\n</p><ul>\n\t<li aria-level=\"1\">Configuration issues: Oversharding, no <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-index-lifecycle-management.html\">ILM</a> policy</li>\n\t<li aria-level=\"1\">Volume induced: High request pace/load, overlapping expensive queries / writes</li>\n</ul><p>All following cURL / API requests can be made in the Elasticsearch Service &gt; API Console, as a cURL to the Elasticsearch API, or under Kibana &gt; Dev Tools.\n</p><h3>Oversharding</h3><p>Data indices <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html\">store into sub-shards</a> which use heap for maintenance and during search/write requests. 
Shard size should <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/7.x//size-your-shards.html#shard-size-recommendation\">cap at 50GB</a> and number should cap as <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/7.x//size-your-shards.html#shard-count-recommendation\">determined via this equation</a>:\n</p><pre class=\"prettyprint\">shards &lt;= sum(nodes.max_heap) * 20\n</pre><p>Taking the above Elasticsearch Service example with 8GB of physical memory across two zones (which will allocate two nodes in total)\n</p><pre class=\"prettyprint\"># node.max_heap \n8GB of physical memory / 2 = 4GB of heap  \n# sum(nodes.max_heap) \n4GB of heap * 2 nodes = 8GB \n# max shards \n8GB * 20 \n160\n</pre><p>Then cross-compare this to either <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-allocation.html\"><strong>_cat/allocation</strong></a>\n</p><pre class=\"prettyprint\">GET /_cat/allocation?v=true&h=shards,node\nshards node\n    41 instance-0000000001\n    41 instance-0000000000\n</pre><p>Or to <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html\"><strong>_cluster/health</strong></a>\n</p><pre class=\"prettyprint\">GET /_cluster/health?filter_path=status,*_shards\n{\n  \"status\": \"green\",\n  \"unassigned_shards\": 0,\n  \"initializing_shards\": 0,\n  \"active_primary_shards\": 41,\n  \"relocating_shards\": 0,\n  \"active_shards\": 82,\n  \"delayed_unassigned_shards\": 0\n}\n</pre><p>So this deployment has 82 shards of the max 160 recommendation. If the count was higher than the recommendation, you may experience the symptoms in the next two sections (see below).\n</p><p>If any shards report &gt;0 outside <em>active_shards</em> or <em>active_primary_shards</em>, you’ve pinpointed a major config cause for performance issues.\n</p><p>Most commonly if this reports an issue, it’ll be <em>unassigned_shards&gt;0</em>. 
If these shards are primary, your cluster will report as <em>status:red</em>, and if only replicas it’ll report as <em>status:yellow</em>. (This is why <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#dynamic-index-settings\">setting replicas on indices</a> is important so that if the cluster encounters an issue it can recover rather than experience data loss.)\n</p><p>Let’s pretend we have a <em>status:yellow</em> with a single unassigned shard. To investigate, we’d take a look at which index shard is having trouble via <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-shards.html\"><strong>_cat/shards</strong></a>\n</p><pre style=\"white-space: pre !important;display: -webkit-box;\" class=\"prettyprint\">GET _cat/shards?v=true&s=state\nindex                                     shard prirep state        docs   store ip           node\nlogs                                      0     p      STARTED         2  10.1kb 10.42.255.40 instance-0000000001\nlogs                                      0     r      UNASSIGNED\nkibana_sample_data_logs                   0     p      STARTED     14074  10.6mb 10.42.255.40 instance-0000000001\n.kibana_1                                 0     p      STARTED      2261   3.8mb 10.42.255.40 instance-0000000001\n</pre><p>So this will be for our non-system index logs, which has an unassigned replica shard. 
Let’s see what’s giving it grief by running <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/7.12/cluster-allocation-explain.html\"><strong>_cluster/allocation/explain</strong></a><strong> </strong>(Pro tip: When you escalate to support, this is <em>exactly</em> what we do)\n</p><pre style=\"white-space: pre !important;display: -webkit-box;\" class=\"prettyprint\">GET _cluster/allocation/explain?pretty&filter_path=index,node_allocation_decisions.node_name,node_allocation_decisions.deciders.*\n{ \"index\": \"logs\",\n  \"node_allocation_decisions\": [{\n      \"node_name\": \"instance-0000000005\",\n      \"deciders\": [{\n          \"decider\": \"data_tier\",\n          \"decision\": \"NO\",\n          \"explanation\": \"node does not match any index setting [index.routing.allocation.include._tier] tier filters [data_hot]\"\n}]}]}\n</pre><p>This error message points to <em>data_hot</em>, which is part of an <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html\">index lifecycle management</a> (ILM) policy and indicates that our ILM policy is incongruent with our current index settings. In this case, the cause of this error is from setting up a hot-warm ILM policy without having designated hot-warm nodes. (I needed to guarantee something would fail, this is me forcing error examples for y’all. 
See what you’ve done to me 😂.)\n</p><p>FYI, if you run this command when you don’t have any unassigned shards, you’ll get a 400 error saying <em>unable to find any unassigned shards to explain</em> because there’s nothing wrong to report on.\n</p><p>If you get a non-logic cause (e.g., a temporary network error like <em>node left cluster during allocation</em>), then you can use Elastic’s handy-dandy <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html\"><strong>_cluster/reroute</strong></a>.\n</p><pre class=\"prettyprint\">POST /_cluster/reroute\n</pre><p>This request without customizations starts an asynchronous background process that attempts to allocate all current <em>state:UNASSIGNED</em> shards. (Don’t be like me: I assumed it would be instantaneous, didn’t wait for it to finish, and escalated to dev just in time for them to say nothing was wrong, because by then nothing was.)\n</p><h3>Circuit breakers</h3><p>Maxing out your heap allocation can cause requests to your cluster to time out or error, and frequently will cause your cluster to experience circuit breaker exceptions. 
Circuit breaking causes <strong>elasticsearch.log</strong> events like\n</p><pre class=\"prettyprint\">Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [&lt;transport_request&gt;] would be [num/numGB], which is larger than the limit of [num/numGB], usages [request=0/0b, fielddata=num/numKB, in_flight_requests=num/numGB, accounting=num/numGB]\n</pre><p>To investigate, take a look at your <em>heap.percent</em>, either by looking at <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-nodes.html\"><strong>_cat/nodes</strong></a>\n</p><pre class=\"prettyprint\">GET /_cat/nodes?v=true&h=name,node*,heap*\n# heap = JVM (logical memory reserved for heap)\n# ram  = physical memory\nname                                node.role heap.current heap.percent heap.max\ntiebreaker-0000000002 mv             119.8mb           23    508mb\ninstance-0000000001   himrst           1.8gb           48    3.9gb\ninstance-0000000000   himrst           2.8gb           73    3.9gb\n</pre><p>Or if you’ve previously enabled it, by navigating to Kibana &gt; Stack Monitoring.\n</p><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/bltb3f4f2c5cdb2823c/608322c4eed217453bf28487/blog-elasticsearch-memory-2.png\" data-sys-asset-uid=\"bltb3f4f2c5cdb2823c\" alt=\"blog-elasticsearch-memory-2.png\">\n</p><p>If you’ve confirmed you’re hitting your memory circuit breakers, you’ll want to consider increasing heap temporarily to give yourself breathing room to investigate. When investigating root cause, look through your cluster proxy logs or elasticsearch.log for the preceding consecutive events. 
You’ll be looking for:\n</p><ul>\n\t<li aria-level=\"1\">expensive queries, especially:\n\t<ul>\n\t\t<li aria-level=\"2\">high bucket aggregations\n\t\t<ul>\n\t\t\t<li aria-level=\"3\">&nbsp;I felt so silly when I found out that searches temporarily allocate a certain portion of your heap <em>before</em> they run the query based on the search <em>size</em> or bucket dimensions, so setting 10,000,000 really was giving my ops team heartburn.</li>\n\t\t</ul></li>\n\t\t<li aria-level=\"2\">non-optimized mappings\n\t\t<ul>\n\t\t\t<li aria-level=\"3\">The second reason to feel silly was when I thought doing hierarchical reporting would search better than flattened-out data (it does not).</li>\n\t\t</ul></li>\n\t</ul></li>\n\t<li aria-level=\"1\">Request volume/pace: Usually batch or async queries</li>\n</ul><h3>Time to scale</h3><p>If this isn’t your first time hitting circuit breakers or you suspect it’ll be an ongoing issue (e.g., consistently hitting 85%, so it’s time to look at scaling resources), you’ll want to take a closer look at <a href=\"https://www.elastic.co/blog/found-understanding-memory-pressure-indicator\">the JVM Memory Pressure</a> as your long-term heap indicator. 
<a href=\"https://www.elastic.co/guide/en/cloud/current/ec-memory-pressure.html\">You can check this</a> in Elasticsearch Service &gt; Deployment\n</p><p><img src=\"https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt42b0179b9a3b7782/608323041cac355a10f6d157/blog-elasticsearch-memory-3.png\" data-sys-asset-uid=\"blt42b0179b9a3b7782\" alt=\"blog-elasticsearch-memory-3.png\">\n</p><p>Or you can calculate it from <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html\"><strong>_nodes/stats</strong></a>\n</p><pre class=\"prettyprint\">GET /_nodes/stats?filter_path=nodes.*.jvm.mem.pools.old\n{\"nodes\": { \"node_id\": { \"jvm\": { \"mem\": { \"pools\": { \"old\": {\n  \"max_in_bytes\": 532676608,\n  \"peak_max_in_bytes\": 532676608,\n  \"peak_used_in_bytes\": 104465408,\n  \"used_in_bytes\": 104465408\n}}}}}}}\n</pre><p>where\n</p><pre class=\"prettyprint\">JVM Memory Pressure = used_in_bytes / max_in_bytes\n</pre><p>A potential symptom of this is high frequency and long duration from garbage collector (gc) events in your elasticsearch.log\n</p><pre class=\"prettyprint\">[timestamp_short_interval_from_last][INFO ][o.e.m.j.JvmGcMonitorService] [node_id] [gc][number] overhead, spent [21s] collecting in the last [40s]\n</pre><p>If you confirm this scenario, you’ll need to take a look either at scaling your cluster or at reducing the demands hitting it. 
You’ll want to investigate/consider:\n</p><ul>\n\t<li aria-level=\"1\">increasing heap resources (heap/node, number of nodes)</li>\n\t<li aria-level=\"1\">decreasing shards (delete unnecessary/old data, <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-index-lifecycle-management.html\">use ILM</a> to put data into <a href=\"https://www.elastic.co/blog/hot-warm-architecture\">warm/cold storage</a> so you can <a href=\"https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-shrink.html\">shrink it</a>, turn off replicas for data you don’t care if you lose)</li>\n</ul><h2>Conclusion</h2><p>Wooh! From what I see in Elastic Support, that’s the rundown of most common user tickets: unassigned shards, unbalanced shard-heap, circuit breakers, high garbage collection, and allocation errors. All are symptoms of the core resource allocation management conversation. Hopefully, you now know the theory and resolution steps, too.\n</p><p>At this point, though, if you’re stuck resolving an issue, feel free to reach out. We’re here and happy to help! You can contact us via <a href=\"http://discuss.elastic.co/\">Elastic Discuss</a>, <a href=\"https://join.slack.com/t/elasticstack/shared_invite/zt-o4sdlhb7-OGXEcy4iry_CsxVyJLGYag\">Elastic Community Slack</a>, consulting, training, and support.\n</p><p>Cheers to our ability to self-manage Elastic Stack’s resource allocation as non-Ops (love Ops, too)!\n</p>","title":"Managing and troubleshooting Elasticsearch memory","tags":{"product":["elasticsearch","kibana","elastic stack"],"industry":[],"level":["advanced"],"use_case":[],"tags":[],"elastic_stack":["blt3d820a0eae1c9158","blt8b37b4b3ec0fe838"],"use_cases":[],"topic":["blt7731091cfa6e23e8","blt30953f4176054d3f"]},"authors":[{"uid":"bltddff0459e563bc78","last_name":"Nestor","job_title":"","company":"","full_name":"Stef Nestor","first_name":"Stef"}],"publish_date":"2021-04-27T15:00:00.000Z","category":["blt1d90b8e0edce3ea9"]}
