-
Notifications
You must be signed in to change notification settings - Fork 810
SOLR-18044 EssentialSolrRequestFilter to perform non-optional request wrapping operations #3997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
4877f19
a951b6a
d9097e5
fc9cad6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| # See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc | ||
| title: EssentialSolrRequestFilter has been added to perform non-optional request wrapping operations required for proper functioning of Solr | ||
| type: other # added, changed, fixed, deprecated, removed, dependency_update, security, other | ||
| authors: | ||
| - name: Gus Heck | ||
| links: | ||
| - name: SOLR-18044 | ||
| url: https://issues.apache.org/jira/browse/SOLR-18044 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one or more | ||
| * contributor license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright ownership. | ||
| * The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
| package org.apache.solr.servlet; | ||
|
|
||
| import jakarta.servlet.FilterChain; | ||
| import jakarta.servlet.ServletException; | ||
| import jakarta.servlet.http.HttpServletRequest; | ||
| import jakarta.servlet.http.HttpServletResponse; | ||
| import java.io.IOException; | ||
| import java.lang.invoke.MethodHandles; | ||
| import org.apache.solr.common.util.SuppressForbidden; | ||
| import org.apache.solr.logging.MDCLoggingContext; | ||
| import org.apache.solr.logging.MDCSnapshot; | ||
| import org.apache.solr.request.SolrRequestInfo; | ||
| import org.apache.solr.util.RTimerTree; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| /** | ||
| * Servlet Filter to set up and tear down a various things that downstream code in solr relies on. | ||
| * It is expected that solr will fail to function as intended without the conditions initialized in | ||
| * this filter. Before adding anything to this filter ask yourself if a user could run without the | ||
| * feature you are adding (either with an alternative implementation or without it at all to reduce | ||
| * cpu/memory/dependencies). If it is not essential, it should have its own separate filter. Also, | ||
| * please include a comment indicating why the thing added is essential. | ||
| */ | ||
| public class EssentialSolrRequestFilter extends CoreContainerAwareHttpFilter { | ||
|
|
||
| // Best to put constant here because solr is not supposed to be functional (or compile) | ||
| // without this filter. | ||
| public static final String CORE_CONTAINER_REQUEST_ATTRIBUTE = "org.apache.solr.CoreContainer"; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. minor: just do
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah took me a moment, but I remember why I didn't do that. I had feared that perhaps the HttpRequest was accessible from one or more of our key plugin types, but your question made me go verify if that was true and I'm happy to have found that that fear was unfounded, so there's no risk that plugin authors (except perhaps anyone who wrote an auth plugin) has discovered and relied on that attribute (which has been in the request attributes since SOLR-4265 in 2013). However, now that I've reassured myself I agree with your suggestion, and no longer worry that it needs to remain constant for all time even if someday a package changes etc. TLDR: will fix :) |
||
| private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); | ||
|
|
||
| @Override | ||
| @SuppressForbidden( | ||
| reason = | ||
| "Set the thread contextClassLoader for all 3rd party dependencies that we cannot control") | ||
| protected void doFilter(HttpServletRequest req, HttpServletResponse res, FilterChain chain) | ||
| throws IOException, ServletException { | ||
| ClassLoader contextClassLoader = Thread.currentThread().getContextClassLoader(); | ||
| // this autocloseable is here to invoke MDCSnapshot.close() which restores captured state | ||
| try (var mdcSnapshot = MDCSnapshot.create()) { | ||
|
|
||
| // MDC logging *shouldn't* be essential but currently is, see SOLR-18050. | ||
| // Our use of SLF4J indicates that we intend logging to be pluggable, and | ||
| // some implementations won't have an MDC, so having it as essential limits | ||
| // logging implementations. | ||
|
Comment on lines
+58
to
+61
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for filing SOLR-18050. It would be nice to get that fixed sometime. But why do you reference that here? Even JUL, which doesn't have MDC, nonetheless effectively does via SLF4J as seen here. Even if it didn't, I'd expect no-op behavior. Thus, we need the code you are lamenting is in this filter.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess SLF4J impls probably have to handle it's absence gracefully, so perhaps it's not as big a deal as my initial thought.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO could/should do the same with RTimer |
||
| log.trace("MDC snapshot recorded {}", mdcSnapshot); // avoid both compiler and ide warning. | ||
| MDCLoggingContext.reset(); | ||
| MDCLoggingContext.setNode(getCores()); | ||
|
|
||
| // This is essential to accommodate libraries that (annoyingly) use | ||
| // Thread.currentThread().getContextClassLoader() | ||
| Thread.currentThread().setContextClassLoader(getCores().getResourceLoader().getClassLoader()); | ||
|
|
||
| // set a request timer which can be reused by requests if needed | ||
| // Request Timer is essential for QueryLimits functionality as well as | ||
| // timing our requests. | ||
| req.setAttribute(SolrRequestParsers.REQUEST_TIMER_SERVLET_ATTRIBUTE, new RTimerTree()); | ||
|
|
||
| // put the core container in request attribute | ||
| // This is essential for the LoadAdminUiServlet class. Removing it will cause 404 | ||
| req.setAttribute(CORE_CONTAINER_REQUEST_ATTRIBUTE, getCores()); | ||
| chain.doFilter(req, res); | ||
| } finally { | ||
| // cleanups for above stuff | ||
| MDCLoggingContext.reset(); | ||
| Thread.currentThread().setContextClassLoader(contextClassLoader); | ||
|
|
||
| // This is an essential safety valve to ensure we don't accidentally bleed information | ||
| // between requests. | ||
| SolrRequestInfo.reset(); | ||
| if (!req.isAsyncStarted()) { // jetty's proxy uses this | ||
|
|
||
| // essential to avoid SOLR-8453 and SOLR-8683 | ||
| ServletUtils.consumeInputFully(req, res); | ||
|
|
||
| // essential to remove temporary files created during multipart requests. | ||
| SolrRequestParsers.cleanupMultipartFiles(req); | ||
| } | ||
| } | ||
| } | ||
| } | ||
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -46,7 +46,7 @@ | |
|
|
||
| <filter> | ||
| <filter-name>MDCLoggingFilter</filter-name> | ||
| <filter-class>org.apache.solr.servlet.MdcLoggingFilter</filter-class> | ||
| <filter-class>org.apache.solr.servlet.EssentialSolrRequestFilter</filter-class> | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mostly for my own understanding, is there a reason not to have ~3 filters here instead of one? One for mdc, one for rtimer, etc? I would guess there is some overhead with each filter, but I would expect that each filter listed here is "required" so why shove them all into one filter?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was my initial instinct, but feedback so far (mostly by David) was that too many tiny filters was undesirable. |
||
| </filter> | ||
|
|
||
| <filter-mapping> | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike the name "Essential". Maybe swap in "Primary". IMO it's essentialness isn't very relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Primary doesn't seem to communicate much since it's a relative word with no context to allow the reader to understand what else would be secondary or tertiary... Would "Required" work for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like Required better.
Is your ultimate plan to put additional filters downstream of this one? If not, then this filter doesn't serve much use separated from SolrDispatchFilter.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, SOLR-18040 (the parent issue) lays out mostly what I'm going to turn into filters. So the plan is to have filters for at least:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hope is that once that is in place the dispatch can be directing to one (or eventually more) servlets for actual search work... One (not well validated) though in the back of my head is that the Admin path, the update path and the different flavors of HttpCall really could be separate servlets (likely with a common base class(es)) and dispatch filter could be reduced to deciding which one to forward to after supporting/adapting our legacy path structures. Then a future project is to reduce the dependency on Http wherever possible so that other protocols like websockets could be added. This also hopefully makes experimenting with alternate authentication/authorization systems easier. (not sure if we can split our current auth stuff into authn and authz, but I'd make separate filters there if I find that to be possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
splitting authn/authz isn't critical if it becomes isolated to a filter. I am aware that there are methods all over returning required permission stuff, but the goal is to get the code that acts on that info segregated such that it can be replaced. Making permission details pluggable would be a later thing if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A major change to SolrDispatchFilter & HttpSolrCall certainly requires a dev list discussion IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mentioned this sort of thing many times before on the list and in conversations for the last 3-4 years. Almost everyone seems to agree that simplifying SolrDispatchFilter is good. The only opposition ever cited was that perhaps we might someday abandon jetty altogether. I don't see anyone tackling that anytime soon, and honestly that effort will probably benefit from not having to pick apart a monster filter whenever it does show up since this direction would translate pretty nicely into a standard decorator pattern. We can call a halt and have another discussion if it seems desirable.
Also, I feel that the filter breakout is good even if we never do the rest of my vision (and that's why my tickets stop there, further changes beyond what I have ticketed are more drastic and certainly deserve discussion), and unless I make a mistake, it should be entirely user transparent (except opening up new possibilities for customization by power users). In our recent conversation you suggested that you were looking to move things to Servlet so I was motivated to do this task that I've been thinking about and fiddling with repeatedly to help minimize what you would have to deal with. You did start some conversation on this back in October of 2024, Only dissent was @hossman noting that things used to be that way before SOLR-104, but it sounds like jsp based UI was a key motivator at the time and that's long gone.
I also think that if we ever do ditch Jetty, having separate servlets for update/query/admin (not part of the current tickets) would make it that much easier to assign those to other ports or protocols or whatever. Right now it's all coupled and firmly tied to http via HttpSolrCall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect I sounded oppositional. I am not. At this point, SolrDispatchFilter is no monster it maybe once was, but nonetheless I welcome better design / decoupling. I really want to see that converted to a Servlet, and I'll raise that to the dev list in the coming months, waiting for your epic to finish and critically a shelved change I have to stop HttpSolrCall from filtering/pass-through.
HttpSolrCall/V2 is a spaghetti monster.